This forum is about

what is the difference between frequent subgraph mining from single large graph and multiple transactional graphs?

Posted by:
**
maya
**

Date: October 28, 2018 01:22AM

Would you please tell me what is the difference between frequent subgraph mining from single large graph and multiple transactional graphs?

Best,

Maya

Best,

Maya

Posted by:
**
webmasterphilfv
**

Date: October 28, 2018 01:33AM

Hi Maya,

From what I read about frequent subgraph mining, most frequent subgraph mining like GSpan can applied to a single graph or to a database of graphs. In other words, it is not difficult to adapt a frequent subbgraph mining algorithm so that it can be applied to these two scenarios. The main difference is how the support is calculated.

Lets say that you want to calculate the support of a subgraph:

If you have a single graph, then the support of the subgraph is usually calculated as the number of occurrences of that subgraph in that single graph. Also, you may want to decide whether you allows some overlapping occurrences or not.

If you have a database of graphs, then the support of the subgraph is the number of graphs that contains this subgraph.

Thus, the main difference between these two scenarios is how the support of a subgraph is calculated. In both cases, calculating the support of a subgraph requires to find all occurrences of the subgraph. This gives you the support in the case of a single graph. In the case of multiple graphs, you also find all occurrences, but need to also check if theses occurrences appear in a same graph. Thus you need to keep the graph ID of each occurrence to calculate the support in the case of multiple graphs. It is just some simple modification to how the support is calculated.

By the way, it is also possible to modify subgraph mining algorithms to work on graphs with directed edges or undirected edges. This is also I think a simple modification.

Best regards,

Edited 1 time(s). Last edit at 10/28/2018 01:34AM by webmasterphilfv.

From what I read about frequent subgraph mining, most frequent subgraph mining like GSpan can applied to a single graph or to a database of graphs. In other words, it is not difficult to adapt a frequent subbgraph mining algorithm so that it can be applied to these two scenarios. The main difference is how the support is calculated.

Lets say that you want to calculate the support of a subgraph:

If you have a single graph, then the support of the subgraph is usually calculated as the number of occurrences of that subgraph in that single graph. Also, you may want to decide whether you allows some overlapping occurrences or not.

If you have a database of graphs, then the support of the subgraph is the number of graphs that contains this subgraph.

Thus, the main difference between these two scenarios is how the support of a subgraph is calculated. In both cases, calculating the support of a subgraph requires to find all occurrences of the subgraph. This gives you the support in the case of a single graph. In the case of multiple graphs, you also find all occurrences, but need to also check if theses occurrences appear in a same graph. Thus you need to keep the graph ID of each occurrence to calculate the support in the case of multiple graphs. It is just some simple modification to how the support is calculated.

By the way, it is also possible to modify subgraph mining algorithms to work on graphs with directed edges or undirected edges. This is also I think a simple modification.

Best regards,

Edited 1 time(s). Last edit at 10/28/2018 01:34AM by webmasterphilfv.

Posted by:
**
maya
**

Date: October 28, 2018 02:00AM

Thank you so much Dr. Philippe for your reply, I really appreciate it, actually I was so confused about this point.

Best Regards,

Maya

Best Regards,

Maya

Posted by:
**
webmasterphilfv
**

Date: October 28, 2018 02:58AM

You are welcome.

Glad to see you often on the forum ;-)

Glad to see you often on the forum ;-)

Posted by:
**
maya
**

Date: December 23, 2018 11:24PM

webmasterphilfv Wrote:

-------------------------------------------------------

>

> Thus, the main difference between these two

> scenarios is how the support of a subgraph is

> calculated. In both cases, calculating the support

> of a subgraph requires to find all occurrences of

> the subgraph. This gives you the support in the

> case of a single graph. In the case of multiple

> graphs, you also find all occurrences, but need to

> also check if theses occurrences appear in a same

> graph. Thus you need to keep the graph ID of each

> occurrence to calculate the support in the case of

> multiple graphs. It is just some simple

> modification to how the support is calculated.

>

>

> Best regards,

Hello Dr. Philfv,

In the case of multiple graphs,in the same transaction, if we find multiple occurrence of the subgraph can we consider all the occurrences or consider it as just one occurrence?

in single graph, does the overlapping allowed or not?

if you know a reference for that, please provide it!

Best Regards,

Maya

-------------------------------------------------------

>

> Thus, the main difference between these two

> scenarios is how the support of a subgraph is

> calculated. In both cases, calculating the support

> of a subgraph requires to find all occurrences of

> the subgraph. This gives you the support in the

> case of a single graph. In the case of multiple

> graphs, you also find all occurrences, but need to

> also check if theses occurrences appear in a same

> graph. Thus you need to keep the graph ID of each

> occurrence to calculate the support in the case of

> multiple graphs. It is just some simple

> modification to how the support is calculated.

>

>

> Best regards,

Hello Dr. Philfv,

In the case of multiple graphs,in the same transaction, if we find multiple occurrence of the subgraph can we consider all the occurrences or consider it as just one occurrence?

in single graph, does the overlapping allowed or not?

if you know a reference for that, please provide it!

Best Regards,

Maya