Contributor Guidelines
If you are interested in contributing to GLI, your contributions will likely fall into one of the following three categories:
You want to contribute a new dataset/task.
You want to implement a new feature.
You want to fix a bug.
Developing GLI
To develop GLI on your machine, here are some tips:
Clone a copy of GLI from source:
git clone https://github.com/Graph-Learning-Benchmarks/gli.git
cd gli
If you already cloned GLI from source, update it:
git pull
Install GLI with full dependencies:
pip install -e ".[test,full]"
This editable mode links the Python files from the current local source tree into the Python install, so if you modify a Python file, you do not need to reinstall GLI.
Run an example:
python3 example.py
This script will load the NodeClassification task on the cora dataset.
Ensure your installation is correct by running the entire test suite with:
make pytest
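If the tests pass but you want to confirm that Python is importing GLI from your local clone rather than from a previously installed copy, a quick check along the following lines may help. This is a minimal sketch, not part of GLI: the helper names `package_location` and `is_inside` are hypothetical, and it assumes the package is importable under the name `gli` and that you run it from the root of your clone.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def package_location(name: str) -> Optional[Path]:
    """Return the directory a top-level package is imported from, or None.

    Hypothetical helper for illustration. Returns None when the package is
    not installed or has no file origin (for example, a namespace package).
    """
    spec = importlib.util.find_spec(name)
    if spec is None or spec.origin is None:
        return None
    return Path(spec.origin).resolve().parent


def is_inside(path: Path, root: Path) -> bool:
    """Return True if `path` is `root` itself or lies somewhere below it."""
    path, root = path.resolve(), root.resolve()
    return path == root or root in path.parents


if __name__ == "__main__":
    # Assumption: the package name is `gli` and the current working
    # directory is the root of the local clone.
    location = package_location("gli")
    if location is None:
        print("gli is not installed in this environment")
    elif is_inside(location, Path.cwd()):
        print(f"gli is imported from the local source tree: {location}")
    else:
        print(f"gli is imported from elsewhere: {location}")
```

If the script reports a location outside your clone, re-running the editable install command from the clone's root directory is a reasonable first step.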
Contributing A New Dataset
Here is a checklist for a new dataset. Please open a pull request that contains a directory in the following format:
datasets/<name>
├── <name>.ipynb/<name>.py
├── README.md
├── LICENSE
├── metadata.json
├── task_<task_type>.json
├── ... # There might be multiple task configurations.
└── urls.json
where <name> is the dataset name and <task_type> is one of the tasks defined in GLI Task Format.
<name>.ipynb/<name>.py: A Jupyter Notebook or Python script that converts the original dataset into GLI format.
README.md: A document that contains the necessary information about the dataset and task(s), including description, citation(s), available task(s), and extra packages required by <name>.ipynb/<name>.py.
LICENSE: The license file used by the current dataset maintainer.
metadata.json: A JSON configuration file that stores the metadata of the graph dataset. See GLI Data Format.
task_<task_type>.json: A task configuration file that stores an available task on the given dataset. See GLI Task Format. Contributors can define multiple tasks on the same dataset. If several tasks share the same task type, use task_<task_type>_<id>.json to distinguish them, where <id> is replaced by 1, 2, etc.
urls.json: A URL configuration file that stores the download URLs of the uploaded files.
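Before opening a pull request, it can be handy to check a dataset directory against the checklist above. The following is a minimal sketch under stated assumptions, not a tool shipped with GLI: the function name `check_dataset_dir` is hypothetical, and the pattern used for task file names (letters, digits, and underscores after `task_`) is an assumption rather than a rule taken from GLI Task Format.

```python
import re
from pathlib import Path
from typing import List

# Files that the checklist requires in every dataset directory.
REQUIRED_FILES = ("README.md", "LICENSE", "metadata.json", "urls.json")

# Assumption: task files look like task_<task_type>.json or
# task_<task_type>_<id>.json, using only letters, digits, and underscores.
TASK_PATTERN = re.compile(r"^task_[A-Za-z0-9_]+\.json$")


def check_dataset_dir(path) -> List[str]:
    """Return a list of human-readable problems; empty means none were found."""
    root = Path(path)
    if not root.is_dir():
        return [f"{root} is not a directory"]

    name = root.name  # the directory name doubles as the dataset name
    files = {p.name for p in root.iterdir() if p.is_file()}
    problems = []

    # Either a notebook or a script may serve as the conversion code.
    if f"{name}.ipynb" not in files and f"{name}.py" not in files:
        problems.append(f"missing conversion script: {name}.ipynb or {name}.py")

    for required in REQUIRED_FILES:
        if required not in files:
            problems.append(f"missing {required}")

    # At least one task configuration must be present.
    if not any(TASK_PATTERN.match(f) for f in files):
        problems.append("missing task configuration: task_<task_type>.json")

    return problems
```

Calling `check_dataset_dir("datasets/<name>")` returns an empty list when every expected file is present. The sketch only checks file names; it does not validate the contents of the JSON files.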
Uploading GLI Data Files
Please upload the npz or npy files referred to in metadata.json or task_<task_type>.json to Dropbox and include the public download links in urls.json. Due to the anonymity requirement, the link is hidden in this document for now.
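One way to catch a forgotten upload is to compare the data files mentioned in the configuration files with the entries in urls.json. The sketch below rests on two assumptions that are not confirmed by this document: that data files appear in metadata.json and task_*.json as string values ending in .npz or .npy, and that urls.json is a flat JSON object keyed by file name. The helper names `referenced_files` and `missing_urls` are hypothetical.

```python
import json
from pathlib import Path
from typing import Any, Set


def referenced_files(node: Any) -> Set[str]:
    """Recursively collect string values that end in .npz or .npy."""
    found: Set[str] = set()
    if isinstance(node, dict):
        for value in node.values():
            found |= referenced_files(value)
    elif isinstance(node, list):
        for item in node:
            found |= referenced_files(item)
    elif isinstance(node, str) and node.endswith((".npz", ".npy")):
        found.add(node)
    return found


def missing_urls(dataset_dir) -> Set[str]:
    """Return referenced data files that have no entry in urls.json.

    Assumption: urls.json maps each file name to its download link.
    Raises FileNotFoundError if metadata.json or urls.json is absent.
    """
    root = Path(dataset_dir)
    urls = json.loads((root / "urls.json").read_text(encoding="utf-8"))

    needed: Set[str] = set()
    for config in [root / "metadata.json", *sorted(root.glob("task_*.json"))]:
        needed |= referenced_files(json.loads(config.read_text(encoding="utf-8")))

    # Iterating a dict yields its keys, i.e. the file names with a link.
    return needed - set(urls)
```

An empty result from `missing_urls("datasets/<name>")` suggests that every referenced file has a link; it does not verify that the links themselves are reachable.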
Reporting Bugs
Please feel free to report a bug through Issues and/or open a pull request to fix it. Provide a clear and concise description of what the bug is. If you are unsure whether something is a bug at all, or how to fix it, post about it in an issue.
Implementing New Features
Please feel free to request a new feature through Issues and/or open a pull request to implement it. In general, we accept any feature that fits the scope of this package. If you are unsure whether your feature fits, or need help with its design or implementation, post about it in an issue.