This forum is about

what is the difference between frequent subgraph mining from single large graph and multiple transactional graphs?

Posted by:
**
maya
**

Date: October 28, 2018 01:22AM

Would you please tell me what is the difference between frequent subgraph mining from single large graph and multiple transactional graphs?

Best,

Maya

Best,

Maya

Posted by:
**
webmasterphilfv
**

Date: October 28, 2018 01:33AM

Hi Maya,

From what I read about frequent subgraph mining, most frequent subgraph mining like GSpan can applied to a single graph or to a database of graphs. In other words, it is not difficult to adapt a frequent subbgraph mining algorithm so that it can be applied to these two scenarios. The main difference is how the support is calculated.

Lets say that you want to calculate the support of a subgraph:

If you have a single graph, then the support of the subgraph is usually calculated as the number of occurrences of that subgraph in that single graph. Also, you may want to decide whether you allows some overlapping occurrences or not.

If you have a database of graphs, then the support of the subgraph is the number of graphs that contains this subgraph.

Thus, the main difference between these two scenarios is how the support of a subgraph is calculated. In both cases, calculating the support of a subgraph requires to find all occurrences of the subgraph. This gives you the support in the case of a single graph. In the case of multiple graphs, you also find all occurrences, but need to also check if theses occurrences appear in a same graph. Thus you need to keep the graph ID of each occurrence to calculate the support in the case of multiple graphs. It is just some simple modification to how the support is calculated.

By the way, it is also possible to modify subgraph mining algorithms to work on graphs with directed edges or undirected edges. This is also I think a simple modification.

Best regards,

Edited 1 time(s). Last edit at 10/28/2018 01:34AM by webmasterphilfv.

From what I read about frequent subgraph mining, most frequent subgraph mining like GSpan can applied to a single graph or to a database of graphs. In other words, it is not difficult to adapt a frequent subbgraph mining algorithm so that it can be applied to these two scenarios. The main difference is how the support is calculated.

Lets say that you want to calculate the support of a subgraph:

If you have a single graph, then the support of the subgraph is usually calculated as the number of occurrences of that subgraph in that single graph. Also, you may want to decide whether you allows some overlapping occurrences or not.

If you have a database of graphs, then the support of the subgraph is the number of graphs that contains this subgraph.

Thus, the main difference between these two scenarios is how the support of a subgraph is calculated. In both cases, calculating the support of a subgraph requires to find all occurrences of the subgraph. This gives you the support in the case of a single graph. In the case of multiple graphs, you also find all occurrences, but need to also check if theses occurrences appear in a same graph. Thus you need to keep the graph ID of each occurrence to calculate the support in the case of multiple graphs. It is just some simple modification to how the support is calculated.

By the way, it is also possible to modify subgraph mining algorithms to work on graphs with directed edges or undirected edges. This is also I think a simple modification.

Best regards,

Edited 1 time(s). Last edit at 10/28/2018 01:34AM by webmasterphilfv.

Posted by:
**
maya
**

Date: October 28, 2018 02:00AM

Thank you so much Dr. Philippe for your reply, I really appreciate it, actually I was so confused about this point.

Best Regards,

Maya

Best Regards,

Maya

Posted by:
**
webmasterphilfv
**

Date: October 28, 2018 02:58AM

You are welcome.

Glad to see you often on the forum ;-)

Glad to see you often on the forum ;-)