readme update (draft 3)
This commit is contained in:
parent
7fd3468d1a
commit
a33442f74c
106
README.md
106
README.md
@ -1,14 +1,15 @@
|
|||||||
# OpenANE: The first open source framework specialized in Attributed Network Embedding (ANE)
|
# OpenANE: The first open source framework specialized in Attributed Network Embedding (ANE)
|
||||||
We reproduce several ANE (Attributed Network Embedding) as well as PNE (Pure Network Embedding) methods in one framework, where they all share the same I/O and downstream tasks. We start this project based on the excellent [OpenNE](https://github.com/thunlp/OpenNE) project that integrates several PNE methods under the same framework. However, OpenANE not only integrates those PNE methods from OpenNE, but also provides the state-of-the-art ANE methods that consider both structural and attribute information during embedding.
|
We reproduce several ANE (Attributed Network Embedding) as well as PNE (Pure Network Embedding) methods in one framework, where they all share the same I/O and downstream tasks. We start this project based on the excellent project [OpenNE](https://github.com/thunlp/OpenNE) that integrates several PNE methods under the same framework. However, OpenANE not only integrates those PNE methods from OpenNE, but also provides the state-of-the-art ANE methods that consider both structural and attribute information during embedding.
|
||||||
|
|
||||||
|
Authors: Chengbin HOU chengbin.hou10@foxmail.com & Zeyu DONG 2018
|
||||||
|
|
||||||
authors: Chengbin HOU (chengbin.hou10@foxmail.com) & Zeyu DONG 2018
|
|
||||||
|
|
||||||
## Motivation
|
## Motivation
|
||||||
In many real-world scenarios, a network often comes with node attributes such as the paper's title in a citation network and user profiles in a social network. PNE methods that only consider structural information cannot make use of attribute information which may further improve the quality of node embedding.
|
In many real-world scenarios, a network often comes with node attributes such as paper metadata in a citation network and user profiles in a social network. PNE methods that only consider structural information cannot make use of attribute information that may further improve the quality of node embeddings.
|
||||||
|
<br> From engineering perspective, by offering more APIs to handle attribute information in graph.py and utils.py, OpenANE shall be very easy to use for embedding an attributed network. Of course, OpenANE can also deal with pure network by calling PNE methods, or by assigning all ones as the attributes and then calling ANE methods.
|
||||||
|
|
||||||
From engineering perspective, by offering more APIs to handle attribute information in graph.py and utils.py, OpenANE shall be very easy to use for embedding an attributed network. Of course, OpenANE can also deal with pure network: 1) by calling PNE methods; and 2) by assigning ones as the attribute for all nodes and then calling ANE methods (but some ANE methods may fail).
|
|
||||||
|
|
||||||
## Methods (todo... Chengbin)
|
## Methods
|
||||||
[ABRW](https://github.com/houchengbin/ABRW),
|
[ABRW](https://github.com/houchengbin/ABRW),
|
||||||
[SAGE-GCN](https://github.com/williamleif/GraphSAGE),
|
[SAGE-GCN](https://github.com/williamleif/GraphSAGE),
|
||||||
[SAGE-Mean](https://github.com/williamleif/GraphSAGE),
|
[SAGE-Mean](https://github.com/williamleif/GraphSAGE),
|
||||||
@ -21,60 +22,38 @@ From engineering perspective, by offering more APIs to handle attribute informat
|
|||||||
[GraRep](https://github.com/thunlp/OpenNE),
|
[GraRep](https://github.com/thunlp/OpenNE),
|
||||||
AttrPure,
|
AttrPure,
|
||||||
AttrComb,
|
AttrComb,
|
||||||
|
<br> Note: all NE methods in this framework are unsupervised.
|
||||||
|
|
||||||
## Requirements (todo... Zeyu; double check)
|
**For more details of each method, please have a look at our paper https://arxiv.org/abs/1811.11728**
|
||||||
|
<br> And if you find ABRW or this framework is useful for your research, please consider citing it.
|
||||||
|
|
||||||
|
|
||||||
|
## Usages
|
||||||
|
#### Requirements
|
||||||
```bash
|
```bash
|
||||||
pip install -r requirements.txt
|
pip install -r requirements.txt
|
||||||
```
|
```
|
||||||
Python 3.6 or above is required due to the [new print(f' ') feature](https://docs.python.org/3.6/reference/lexical_analysis.html#f-strings)
|
Python 3.6 or above is required due to the [new print(f' ') feature](https://docs.python.org/3.6/reference/lexical_analysis.html#f-strings)
|
||||||
|
#### To obtain node embeddings as well as evaluate the quality by default tasks
|
||||||
## Usages
|
|
||||||
#### To obtain your node embeddings and evaluate them by classification downstream tasks
|
|
||||||
```bash
|
```bash
|
||||||
python src/main.py --method abrw --emb-file cora_abrw_emb --save-emb
|
python src/main.py --method abrw --emb-file cora_abrw_emb --save-emb
|
||||||
```
|
```
|
||||||
#### To have an intuitive feeling of node embeddings (todo... Zeyu if possible; need tf installed)
|
#### To have an intuitive feeling in node embeddings
|
||||||
```bash
|
```bash
|
||||||
python src/viz.py --emb-file cora_abrw_emb --label-file data/cora_label
|
python src/viz.py --emb-file cora_abrw_emb --label-file data/cora_label
|
||||||
```
|
```
|
||||||
|
|
||||||
## Parameters
|
|
||||||
#### the meaning of each parameter
|
|
||||||
please see main.py
|
|
||||||
#### searching optimal value of parameter (todo... Chengbin)
|
|
||||||
ABRW
|
|
||||||
SAGE-GCN
|
|
||||||
|
|
||||||
## Testing
|
|
||||||
|
|
||||||
|
## Testing (Cora)
|
||||||
### Parameter Setting
|
### Parameter Setting
|
||||||
|
Currently, we use the default parameters
|
||||||
Currently, we use the default parameter
|
|
||||||
|
|
||||||
| AANE_lamb | AANE_maxiter | AANE_rho | ABRW_alpha | ABRW_topk | ASNE_lamb | AttrComb_mode | GraRep_kstep | LINE_negative_ratio | LINE_order | Node2Vec_p | Node2Vec_q | TADW_lamb | TADW_maxiter | batch_size | dim | dropout | epochs | label_reserved | learning_rate | link_remove | number_walks | walk_length | weight_decay | window_size | workers |
|
| AANE_lamb | AANE_maxiter | AANE_rho | ABRW_alpha | ABRW_topk | ASNE_lamb | AttrComb_mode | GraRep_kstep | LINE_negative_ratio | LINE_order | Node2Vec_p | Node2Vec_q | TADW_lamb | TADW_maxiter | batch_size | dim | dropout | epochs | label_reserved | learning_rate | link_remove | number_walks | walk_length | weight_decay | window_size | workers |
|
||||||
|-----------|--------------|----------|------------|-----------|-----------|---------------|--------------|---------------------|------------|------------|------------|-----------|--------------|------------|-----|---------|--------|----------------|---------------|-------------|--------------|-------------|--------------|-------------|---------|
|
|-----------|--------------|----------|------------|-----------|-----------|---------------|--------------|---------------------|------------|------------|------------|-----------|--------------|------------|-----|---------|--------|----------------|---------------|-------------|--------------|-------------|--------------|-------------|---------|
|
||||||
| 0.05 | 10 | 5 | 0.8 | 30 | 1 | concat | 4 | 5 | 3 | 0.5 | 0.5 | 0.2 | 10 | 128 | 128 | 0.5 | 100 | 0.7 | 0.001 | 0.1 | 10 | 80 | 0.0001 | 10 | 24 |
|
| 0.05 | 10 | 5 | 0.8 | 30 | 1 | concat | 4 | 5 | 3 | 0.5 | 0.5 | 0.2 | 10 | 128 | 128 | 0.5 | 100 | 0.7 | 0.001 | 0.1 | 10 | 80 | 0.0001 | 10 | 24 |
|
||||||
|
|
||||||
### Testing Result
|
### Testing Result
|
||||||
|
#### Link Prediction and Node Classification tasks:
|
||||||
**citeseer**:
|
|
||||||
|
|
||||||
| method | AUC | Micro-F1 | Macro-F1 | Time |
|
|
||||||
|----------|--------|----------|----------|----------|
|
|
||||||
| aane | 0.8889 | 0.7067 | 0.6295 | 36.99 |
|
|
||||||
| abrw | 0.9342 | 0.7287 | 0.6705 | 64.32 |
|
|
||||||
| asne | 0.8293 | 0.5275 | 0.4736 | 70.89 |
|
|
||||||
| attrcomb | 0.8756 | 0.7077 | 0.6592 | 99.19 |
|
|
||||||
| attrpure | 0.8684 | 0.6922 | 0.6525 | 0.99 |
|
|
||||||
| deepwalk | 0.7203 | 0.5681 | 0.5205 | 93.14 |
|
|
||||||
| grarep | 0.8501 | 0.5200 | 0.4656 | 16.48 |
|
|
||||||
| line | 0.6340 | 0.3959 | 0.3503 | 242.59 |
|
|
||||||
| node2vec | 0.6588 | 0.5931 | 0.5493 | 27.60 |
|
|
||||||
| sagegcn | 0.8953 | 0.6016 | 0.5247 | 444.50 |
|
|
||||||
| sagemean | 0.8772 | 0.6391 | 0.5606 | 371.74 |
|
|
||||||
| tadw | 0.8984 | 0.7337 | 0.6866 | 14.39 |
|
|
||||||
|
|
||||||
**cora:**
|
|
||||||
|
|
||||||
| method | AUC | Micro-F1 | Macro-F1 | Time |
|
| method | AUC | Micro-F1 | Macro-F1 | Time |
|
||||||
|----------|--------|----------|----------|----------|
|
|----------|--------|----------|----------|----------|
|
||||||
@ -91,37 +70,32 @@ Currently, we use the default parameter
|
|||||||
| sagemean | 0.8882 | 0.8057 | 0.7902 | 183.65 |
|
| sagemean | 0.8882 | 0.8057 | 0.7902 | 183.65 |
|
||||||
| tadw | 0.9005 | 0.8383 | 0.8255 | 10.73 |
|
| tadw | 0.9005 | 0.8383 | 0.8255 | 10.73 |
|
||||||
|
|
||||||
**mit:**
|
#### Visualization task
|
||||||
|
2D visualization results of the node embeddings on Cora dataset;
|
||||||
|
<br> Steps: Cora -> NE method -> node embeddings -> PCA -> 2D viz;
|
||||||
|
<br> The different colors indicate different ground truth labels;
|
||||||
|
<br> ![Cora viz](https://github.com/houchengbin/OpenANE/blob/master/log/viz.jpg)
|
||||||
|
|
||||||
| method | AUC | Micro-F1 | Macro-F1 | Time |
|
## Other Datasets
|
||||||
|----------|--------|----------|----------|----------|
|
More well-prepared (attributed) network datasets are available at [NetEmb-Datasets](https://github.com/houchengbin/NetEmb-datasets)
|
||||||
| aane | 0.6586 | 0.3742 | 0.0730 | 83.49 |
|
|
||||||
| abrw | 0.9068 | 0.7981 | 0.2286 | 113.41 |
|
|
||||||
| asne | 0.6596 | 0.3041 | 0.0681 | 901.18 |
|
|
||||||
| attrcomb | 0.8548 | 0.8016 | 0.2223 | 125.82 |
|
|
||||||
| attrpure | 0.6464 | 0.3497 | 0.0707 | 1.19 |
|
|
||||||
| deepwalk | 0.9190 | 0.7997 | 0.2295 | 173.70 |
|
|
||||||
| grarep | 0.8983 | 0.7695 | 0.1901 | 51.55 |
|
|
||||||
| line | 0.7836 | 0.7415 | 0.1857 | 6335.43 |
|
|
||||||
| node2vec | 0.9088 | 0.8085 | 0.2356 | 465.54 |
|
|
||||||
| sagegcn | 0.8005 | 0.6057 | 0.1451 | 12462.35 |
|
|
||||||
| sagemean | 0.7525 | 0.5796 | 0.1279 | 10534.44 |
|
|
||||||
|
|
||||||
## Datasets (todo...Chengbin)
|
### Your own dataset
|
||||||
We provide Cora for ... and other datasets e.g. Facebook_MIT, refer to [NetEmb-Datasets](https://github.com/houchengbin/NetEmb-datasets)
|
**FILE for structural information (each row):**
|
||||||
|
<br> adjlist: node_id1 node_id2 node_id3 -> (the edges between (id1, id2) and (id1, id3))
|
||||||
|
<br> OR edgelist: node_id1 node_id2 weight(optional) -> one edge (id1, id2)
|
||||||
|
<br> **FILE for attribute information (each row):**
|
||||||
|
<br> node_id1 attr1 attr2 ... attrM
|
||||||
|
<br> **FILE for label (each row):**
|
||||||
|
<br> node_id1 label(s)
|
||||||
|
|
||||||
### Your own dataset?
|
### Parameters Tuning
|
||||||
#### FILE for structural information (each row):
|
For different dataset, one may need to search the optimal parameters instead of taking the default parameters.
|
||||||
adjlist: node_id1 node_id2 node_id3 -> (the edges between (id1, id2) and (id1, id3))
|
For the meaning and suggestion of each parameter, please see main.py.
|
||||||
|
|
||||||
OR edgelist: node_id1 node_id2 weight(optional) -> one edge (id1, id2)
|
|
||||||
#### FILE for attribute information (each row):
|
|
||||||
node_id1 attr1 attr2 ... attrM
|
|
||||||
|
|
||||||
#### FILE for label (each row):
|
## Want to contribute
|
||||||
node_id1 label(s)
|
|
||||||
|
|
||||||
## Want to contribute?
|
|
||||||
We highly welcome and appreciate your contributions on fixing bugs, reproducing new ANE methods, etc. And together, we hope this OpenANE framework would become influential on both academic research and industrial usage.
|
We highly welcome and appreciate your contributions on fixing bugs, reproducing new ANE methods, etc. And together, we hope this OpenANE framework would become influential on both academic research and industrial usage.
|
||||||
|
|
||||||
## Recommended References (todo... Chengbin)
|
|
||||||
|
## References
|
||||||
|
todo...
|
||||||
|
Binary file not shown.
BIN
log/viz.jpg
BIN
log/viz.jpg
Binary file not shown.
Before Width: | Height: | Size: 244 KiB After Width: | Height: | Size: 201 KiB |
Loading…
Reference in New Issue
Block a user