Releases: pykeen/pykeen
v1.10.2
What's Changed
- 🧰♻️ Add Dependency Caching by @mberr in #1239
- ⚖️🗺️ Fix estimate diversity function by @dobraczka in #1242
- 👋🌐 Update Hello World Notebook by @mberr in #1249
- 💾📉 Improve Memory-Efficiency of Converting OGB Datasets to TriplesFactory by @mberr in #1253
- 🔧✍️ Fix CachedTextRepresentation.from_dataset by @mberr in #1259
- 🧬 💾 Add the PharMeBINet dataset by @sbonner0 in #1257
- ⚗️🎯 Fix norm limit regularizers hpo defaults by @mberr in #1274
- 🤔🙅♀️ Fix TransH HPO compatibility by @mberr in #1277
- 🔥🪞 Fix relation prediction with inverses by @mberr in #1304
- 🧹🧑⚖️ add gc_after_trial parameter for optuna by @LizzAlice in #1301
- 🐏🚀 Use torch_max_mem for automatic memory optimization in Evaluator by @mberr in #1261
- ⚓↕️ CSGraphAnchorSearcher: Ensure top-k index array is sorted by @mberr in #1318
- 🔥🧩 Fix NodePiece & Complex Embeddings by @mberr in #1288
- 🥄🧩 Utility to create inductive NodePiece representations by @mberr in #1322
- 🍂🚮 Ensure cleanup of temporary files by @mberr in #1307
- ↔️ ⭐ Merge BoxE interaction by @mberr in #1180
- 🌅🔪 Early Slicing for Lazy Target Representations by @mberr in #1321
- 🔧🔮 Fix predict_all for inductive inference by @mberr in #1320
- 🏗️🌊 Support resolving non-rank-based metrics by @mberr in #1237
- 📖✨ Update readme by @mberr in #968
- ⁉️ ⭕ Raise error when encountering a checkpoint_name in the HPO configuration by @mberr in #1324
- 📉☎️ Validation Loss Training Callback by @mberr in #1169
- Relax protobuf requirement by @cthoyt in #1332
- ✅📉 Check Losses' HPO Defaults by @mberr in #1333
- 🔝📐 Bump PyTorch Geometric cpu-build version for CI by @mberr in #1336
- ☑️💸 Update Losses' HPO default range checks by @mberr in #1337
- 🎣🏆 Repo cleanup and fix RGCN's hpo_default by @mberr in #1370
New Contributors
- @LizzAlice made their first contribution in #1301
Full Changelog: v1.10.1...v1.10.2
v1.10.1
What's Changed
- 📚🛗 Update Prediction Migration Guide by @mberr in #1233
- 🚚🪞 Make sure inductive representation is on same device by @dobraczka in #1229
Full Changelog: v1.10.0...v1.10.1
v1.10.0
The PyKEEN 1.10 release contains a wide variety of bug fixes, performance improvements, and new features. A few highlights include a symmetric sLCWA training loop, evaluation with OGB, biomedical entity representation modules, low-rank representation approximation, and many improvements to the prediction pipeline.
Models and Layers
- 🔧✈️ Fix TransH by @mberr in #1057
- ♚♔ Update ConvKB interaction by @mberr in #1065
- 🐸🫖 Fix QuatE by @mberr in #1056
- 🐸🫖 Fix failing QuatE tests by @mberr in #1070
- 🩹💯 Fix typo in OpenEA graph size by @dobraczka in #1073
- 🌑⛱️ Inductive ERModel base class by @mberr in #1106
- 🏋️ 🔧 Kwargs fix for edge weighting in CompGCNLayer by @migalkin in #1138
- ↔️ ⭐ Merge ComplEx interaction by @mberr in #1103
- ↔️ ⭐ Merge AutoSF interaction by @mberr in #1112
- 🦷 🔄 Extract inversion utilities by @cthoyt in #1203
NodePiece
- 📿 🧩 Support disconnected nodes in the Relation tokenizer by @migalkin in #1064
- ⚙️🔢 Fix InductiveNodePiece for parametric aggregations by @mberr in #1104
Documentation
Performance
- 🍭 🚀 Add optional dep for opt_einsum by @mberr in #1058
- ⚽ 🔧 Properly shuttle tensors to CPU in rank-based evaluator by @Allaway11 in #1076
- 🍭🧑🚀 Update remaining einsum usages by @mberr in #1068
- ♻️⚙️ Re-use AMO for predict_triples_df by @mberr in #1089
Pipeline and Prediction
- 🔮☕ Custom Prediction Filtering by @mberr in #1090
- 🪓💻 Split Pipeline Code by @mberr in #1075
- 🔮🤔 Upgrade Prediction Consumers by @mberr in #1078
- 🌀 🏴☠️ Clean up inverse triples arguments by @cthoyt in #1092
- 🐛🐔 Fixing bug with index-error for predict.consume_scores by @AImenes in #1157
Representation
- 🔵🟦 Approximate other representation with Low-Rank by @mberr in #1091
- 🗿📖 Update text-based representation tutorial by @mberr in #1147
- 📜🔧 Fix labels not being converted to list by @mberr in #1209
- 🪞🧬 Text representation for biomedical entities via PyOBO by @cthoyt in #1055
Training and Negative Sampling
- ☺️ 🫖 Relax Slicing Check by @mberr in #1216
- 🪞🚆 Symmetric LCWA by @mberr in #1098
- 🫴🔋 Fix to pass any negative sampler kwargs through if present by @sbonner0 in #1119
- 💾🚆 Fix re-loading training triples by @mberr in #1185
Loss
- Remove unnecessary unsqueezing in MRL by @sbonner0 in #1128
- 📖 🛠️ Fix losses docs typos by @nicolafan in #1200
Metrics and Evaluation
- 🔍📈 Fix MacroRankBasedEvaluator share precomputed weights for different triples by @mberr in #1079
- 🧮🪞 Rank-Based Metrics with Confidence Estimates via Bootstrapping by @mberr in #1084
- Incremental ranks by @mberr in #1083
- 😮🪑 Add basic support of OGB's evaluation by @mberr in #1088
- 😮🪑 Add OGB Evaluator by @mberr in #948
- 🌪️➰ Evaluation loop and filter triples by @mberr in #1214
- ⛳💌 Separately Expose Clearing of Intermediate Evaluator Results by @mberr in #1195
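As a rough illustration of what "rank-based metrics with confidence estimates via bootstrapping" can mean, here is a minimal standalone sketch in plain Python. It is not PyKEEN's implementation: the function names, the percentile-interval choice, and the example ranks are assumptions made for this illustration only.

```python
import random
from statistics import mean


def mean_reciprocal_rank(ranks):
    """Return the mean of 1/rank over all (1-based) ranks."""
    return mean(1.0 / r for r in ranks)


def bootstrap_interval(ranks, metric, n_boot=1000, alpha=0.05, seed=0):
    """Percentile confidence interval for `metric`, obtained by resampling ranks.

    Hypothetical helper for illustration; not part of the PyKEEN API.
    """
    rng = random.Random(seed)
    n = len(ranks)
    # Resample the ranks with replacement and recompute the metric each time.
    samples = sorted(
        metric([ranks[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = samples[int((alpha / 2) * n_boot)]
    upper = samples[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Made-up ranks, one per evaluated triple.
ranks = [1, 3, 2, 10, 1, 7, 4, 1, 25, 2]
point = mean_reciprocal_rank(ranks)
low, high = bootstrap_interval(ranks, mean_reciprocal_rank)
print(f"MRR = {point:.3f}, 95% CI = [{low:.3f}, {high:.3f}]")
```

The interval width shrinks as more ranks are available, which is why a confidence estimate is most informative on small evaluation sets.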
Lightning
- ⚡🩹 Fix lightning training without validation by @mberr in #1158
- ⚡🧪 Bring back lightning tests by @mberr in #1191
Misc
- 📌🖼️ Fix plotting utility by @mberr in #1072
- ☁️☀️ Fix word cloud by @mberr in #1081
- ✂️⌨️ Remove unused code by @mberr in #1095
- 💲🐚 Remove $ sign from README shell examples by @mberr in #1101
- 🩹🧑🏫 Fix scikit-learn dependency by @dobraczka in #1122
- 👷🚇 Fix CI failure due to new versions by @mberr in #1170
- 🩹🪟 Fix Windows installation instructions by @jamesmyatt in #1164
- 🧹🐻 Cleanup before 1.9 release by @cthoyt in #1204
- 🩹⚙️ Include result_tracker field in dict returned by _get_best_study_config() by @AntonisKl in #1206
New Contributors
- @Allaway11 made their first contribution in #1076
- @jamesmyatt made their first contribution in #1164
- @AImenes made their first contribution in #1157
- @nicolafan made their first contribution in #1200
- @AntonisKl made their first contribution in #1206
Full Changelog: v1.9.0...v1.10.0
v1.9.0
The theme of this release of PyKEEN is centered on new and exciting representations to bring more kinds of data (text, image, scalar data) into training in an elegant way. Several of these contribute to new functionality for NodePiece.
Training and Evaluation
- 🔬🔁 Evaluation loop by @mberr in #768
- 🐦🛑 Early stopping: Reload weights from best epoch by @mberr in #961
- 🔬🚪 Update evaluator's evaluate to pass through kwargs by @mberr in #938
- 🌪😿 Fix epoch loss by @mberr in #1021
Datasets
- 🥨🕸️ Add Global Biotic Interactions (GloBI) dataset by @cthoyt in #947
- Fix dataset caching with inverse triples by @mberr in #1034
Models
New
Updates
- 📏🤝 Add LineaRE interaction by @mberr in #971
- ✨🤖 Update ERMLP to ERModel by @mberr in #869
- ✨🤖 Update ERMLP-E to ERModel by @mberr in #872
- ✨🤖 Update HolE to ER-Model by @mberr in #953
- ✨🤖 Update TransE to ER-Model by @mberr in #955
- ✨🤖 Update TransH to ER-Model by @mberr in #954
- ✨🤖 Update RESCAL to ER-Model by @mberr in #952
- ✨💀 Phase out old-style model by @mberr in #865
- 🫶 🧨 Update ConvKB & SE to use einsum by @mberr in #978
- 🏎️ 🏴☠️ Add efficient RGCN implementation by @mberr in #634
- 🔧➡️ Move Nguyen's TransE configurations into correct directory by @PhaelIshall in #957
Representations
- 👥📝 Wikidata Textual Representations by @mberr in #966
- 🔗🗿 Combined Representation by @mberr in #964
- 🔳🔲 Add PartitionRepresentation by @mberr in #980
- 📋❇️ Generalize Text Encoders & add a simple one by @mberr in #969
- 🚚🗽 Add transformed representation by @mberr in #984
- 👀📇 Simple visual representations by @mberr in #965
- 🏋️🚂 Tensor Train Representation by @mberr in #989
NodePiece
- ⚓🔍 NodePiece: GPU-enabled BFS searcher by @migalkin in #990
- 🏴☠️🌊 NodePiece x METIS by @mberr in #988
- ⚓ 📖 NodePiece documentation on MetisAnchorTokenizer by @migalkin in #1026
Documentation
- 😺🇪🇬 Explicitly set Sphinx language by @mberr in #951
- 💥 📒 Add troubleshooting for loading old models by @jas-ho in #963
- 📘 🚀 Update README by @cthoyt in #1039
- 📒 🤡 Fix documentation build by @cthoyt in #946
- 📕 🏴☠️ Update docs and deprecations by @mberr in #979
- 📗 🖊️ Update docs about normalizers and constrainers by @mberr in #1047
Loss
- ⚔️⚖️ Add adversarially weighted BCE loss by @mberr in #958
- ⚔️🤔 New procedure for computing AdversarialBCEWithLogits by @migalkin in #997
Predictions
- 🐉🐉 Score multiple tails at once by @mberr in #949
- 🔮〰️ Update Prediction Filtering by @mberr in #1048
- 🔮 🎉 Add inference_mode annotation to get_prediction_df() by @tatiana-iazykova in #1024
- 🔨🧪 Fix the device in _safe_evaluate() by @migalkin in #1041
Meta
Misc
- 🚨📊 Cast kwargs as strings in plot_er by @vsocrates in #945
- ⛏📲 Add utility to analyze degree distributions by @mberr in #857
- 🥯✔️ Add max_id/shape verification by @mberr in #983
- Use torch_ppr by @mberr in #995
- ➕🍹 Add ExtraReprMixin by @mberr in #994
- 🛤️🛢️ Add prefix when tracking pipeline metrics by @mberr in #998
- ⭕🔺 Update PyG version for CI by @mberr in #1025
- #️⃣🐍 Allow passing numpy.ndarray to CoreTriplesFactory by @mberr in #1029
New Contributors
- @PhaelIshall made their first contribution in #957
- @jas-ho made their first contribution in #963
- @tatiana-iazykova made their first contribution in #1024
Full Changelog: v1.8.2...v1.9.0
v1.8.2
Datasets
- Add the PrimeKG dataset by @sbonner0 in #915
- 🌀🔗 Extend EA datasets to allow loading a unified graph by @mberr in #871
- 🎺🎷 Fix wk3l loading by @mberr in #907
Lightning
- 🔥⚡ PyTorch Lightning by @mberr in #905
- 🔥⚡ PyTorch Lightning - Part 2 by @mberr in #917
- 🚅⚡ Test Training with PyTorch Lightning by @mberr in #930
Losses
- 📉🧑🤝🧑 Fix default loss of PairRE by @mberr in #925
- ℹ️🦭 Add InfoNCE loss by @mberr in #926
- ℹ️🚀 Update InfoNCE LCWA implementation by @mberr in #928
Representations
- 🎲🚶 Random Walk Positional Encoding by @mberr in #918
- 🏛️👨 Weisfeiler-Lehman Features by @mberr in #920
Other great stuff that isn't the previous commit (it's after 5PM)
- 🧫🐍 Update scipy minimum version by @mberr in #891
- ♻️☎️ Re-use optimized batch-size in evaluation callback by @mberr in #886
- 🖥️🦎 Fix complex initialization by @mberr in #888
- 📦📚 Update BoxE reproducibility configurations by @mberr in #631
- 🫓🪁 Improve loading of triples with nan strings by @SenJia in #883
- 🪵 ✨ Update flake8 ignores by @cthoyt in #897
- 👯♂️👯♀️ Unique hashes in the NodePiece representation by @migalkin in #896
- 📐📨 PyTorch Geometric Message Passing Representations by @mberr in #894
- 🪛📁 Fix directory path normalization by @mberr in #890
- 🧛🇪🇺 Implement more graph pair unification approaches by @mberr in #893
- 🔙🌙 Backwards Compatibility for init phases by @mberr in #899
- 📔✅ Update Docstring Coverage check by @mberr in #892
- 🪄🖊️ Class resolver type annotations by @mberr in #904
- 📋➡️ Move listing experiments from epilog to own command by @mberr in #903
- 🔧📜 Update hpo tutorial about grid search by @mberr in #902
- 📖 🛠️ Fix typo in prediction docs by @mberr in #912
- ✂️🌰 Extract triple-independent information from CoreTriplesFactory by @mberr in #908
- 🐍👍 Increase Minimum Python Version to 3.8 by @mberr in #921
- 🧚💾 Extend save to directory doc by @mberr in #916
- 🧠🏷️ Maximize memory utilization for label based initialization by @mberr in #898
- ✏️🇮🇳 Rename inductive representation methods by @mberr in #929
- 👾 ⚽ Add missing device by @vsocrates in #936
New Contributors
- @SenJia made their first contribution in #883
- @vsocrates made their first contribution in #936
Full Changelog: v1.8.1...v1.8.2
v1.8.1
PyKEEN 1.8.1 contains a few critical bug fixes along with some other cool updates.
Evaluation
Inductive Models
Transductive Models
- ✨🤖 Update DistMult to ERModel by @mberr in #874
- ✨🤖 Update ProjE to ERModel by @mberr in #876
- ✨🤖 Update RotatE to ERModel by @mberr in #877
- ✨🤖 Update ConvE to ERModel by @mberr in #875
- 🚛®️ Update TuckER to ERModel by @mberr in #866
- ✨🦜 Upgrade TransR to ERModel by @mberr in #868
New Datasets
- 🌪️ 📖 Add ILPC datasets and inductive dataset resolver by @cthoyt in #848
- 👑🤑 Add aristo-v4 dataset by @mberr in #855
Documentation
- 📗✨ Update documentation to better reflect new-style models by @mberr in #879
- 👣 📚 Correct typos in "First Steps" tutorial by @andreasala98 in #846
Bug Fixes
- 🔧#️⃣ Fix arange dtype and clip variances by @mberr in #881
- 🪄⚖️ Fix pop_regularization_term by @mberr in #849
- 🧑🏭🔢 Fix numeric triples factory by @mberr in #862
- 🍔 🪓 Ensure reproducible splits for all datasets by @mberr in #856
- 🚫🏋 Raise explicit error if no training batch was available by @mberr in #860
- 🚚💻 Fix TransformerEncoder tokens' device by @mberr in #861
Misc
- 🔎🏋️ Resolve optimizer, LR-scheduler & tracker in training loop by @mberr in #852
- 🎯🪜 Update default batch size HPO range by @mberr in #864
- ♻️🔥 Use torch builtin broadcast by @mberr in #873
New Contributors
- @andreasala98 made their first contribution in #846
Full Changelog: v1.8.0...v1.8.1
v1.8.0
Among a ton of updates since the beginning of the year, PyKEEN v1.8.0 has three major themes:
- The introduction of the inductive link prediction pipeline and the NodePiece model. We highly suggest checking out An Open Challenge for Inductive Link Prediction on Knowledge Graphs to go along with this new pipeline and models.
- The introduction of new rank-based evaluation metrics to go along with A Unified Framework for Rank-based Evaluation Metrics for Link Prediction in Knowledge Graphs
- Major internal refactoring of negative sampling to better use PyTorch's data loaders and support multi-CPU generation (special thanks to @Koenkalle for help testing this)
NodePiece and Inductive Link Prediction
- 🦆🐍 Inductive LP framework by @migalkin in #722
- 🌙🐺 Add mode parameter by @cthoyt in #769
- 🍸✌️ Mixed tokenization for NodePiece by @mberr in #770
- ☮️⚓ NodePiece with anchors by @mberr in #755
- 🥒🏴☠️ Precomputed Tokenization for NodePiece by @mberr in #822
- 🦜📖 Refactor NodePiece and improve documentation by @cthoyt in #833
- ⚓ 🔧 NodePiece MixtureAnchorSelection unique anchor IDs fix + PageRank fix by @migalkin in #776
- 🧩🧪 NodePiece experimental configs by @migalkin in #771
- 👀 🏋️ Attention edge weighting by @migalkin in #734
Models
New
- ⛟↔️ Add (multi-)linear Tucker interaction by @mberr in #751
- 🍦 🎸 Soft inverse triples baseline by @mberr in #543
Updated
- 🤖 ⛑️ Fix device for FixedModel by @mberr in #725
- 🧀 🏹 Unify usage of slice_size by @cthoyt in #729
- 📟 ✂️ Remove device from model by @cthoyt in #730
- 💃 🪥 Cleanup model argument passing by @cthoyt in #762
Training and Evaluation
- ✂️ ⏰ Split early stopping logic from evaluation by @mberr in #355
- 🎲🎚️ Sampled Rank-Based Evaluator by @mberr in #733
- ⛏©️ Fix Checkpointing by @mberr in #740
- 🌋 🗺️ Switch evaluator from dataclass to dict by @cthoyt in #780
- 🌀 ⚖️ Simplify evaluate by @mberr in #767
- 🛤️ 🔁 Store result tracker inside loop by @mberr in #793
Callbacks
- 📞🔙 Evaluation callback by @mberr in #765
- 🥊 ☎️ Early stopping via training callback by @mberr in #354
Data and Datasets
New
- 🪢🤔 Add OpenEA datasets by @dobraczka in #784
- 💉8️⃣ Add the PharmKG8k dataset by @sbonner0 in #797
- 🧪💉 Add PharmKG full dataset by @sbonner0 in #806
- ♻️ 2️⃣ Replace OGB's WikiKG by WikiKG2 by @mberr in #809
Updates
- 📌🧠 Use pinned memory for training data loader by @mberr in #747
- 🧮💃 Add property for number of parameters by @mberr in #804
- 💾 ♻️ Refactor dataset utility code by @cthoyt in #830
- 💾 🐕 Update dataset registration by @cthoyt in #832
- 💾 🚀 Update dataset statistics by @cthoyt in #834
- 😴🐲 Ignore create_inverse_triples for caching hash digest by @mberr in #813
- 🖇️ 📊 Use Figshare link for OpenEA dataset by @dobraczka in #838
- 📦 💾 Batch data loader by @mberr in #817
- 📥 🏭 Save Training Triples Factory by @mali-git in #655
- 🦞 💿 Negative sampling in data loader by @mberr in #417
- 💾💽 Change serialization format by @mberr in #785
- 🧰 📥 Cache dataset loading by @mberr in #569
Metrics
- 📏🪕 Compute Candidate Set Sizes by @mberr in #732
- 🏆 🍱 Update rank data structure by @cthoyt in #758
- 📐 🍱 Update metric key data structure by @cthoyt in #759
- 🐳 😃 Reorganize metrics and expectation functions by @cthoyt in #763
- 🏛️ 👽 Add improved indicator constructor by @cthoyt in #781
- 🏛️ 🥾 Improve metrics data structures by @cthoyt in #782
- 🎩 🎸 Class-Based Rank-Based Metrics by @mberr in #786
- ⚙️🌡️ Add more adjusted metrics by @cthoyt in #814
- 🪡🗜️ Refactor derived metrics by @mberr in #835
- 🔢 🐎 Update value range & docstring of adjusted metrics by @mberr in #823
- ➕🌍 Add option to add all default rank-based metrics by @mberr in #827
- 🪛💡 Fix RankBasedMetricResults.iter_rows by @mberr in #792
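To make the link between candidate set sizes and adjusted metrics concrete, the following sketch computes the adjusted mean rank (AMR) and adjusted mean rank index (AMRI) from their standard definitions. The function names and example numbers are assumptions for this illustration and are not PyKEEN's API.

```python
from statistics import mean


def expected_mean_rank(candidate_set_sizes):
    """Expected mean rank under uniformly random scoring.

    With n candidates, a uniformly random rank has expectation (n + 1) / 2.
    """
    return mean((n + 1) / 2 for n in candidate_set_sizes)


def adjusted_mean_rank(ranks, candidate_set_sizes):
    """AMR = MR / E[MR]; values below 1 are better than random."""
    return mean(ranks) / expected_mean_rank(candidate_set_sizes)


def adjusted_mean_rank_index(ranks, candidate_set_sizes):
    """AMRI = 1 - (MR - 1) / (E[MR] - 1); 1 is optimal, 0 is random-level."""
    expected = expected_mean_rank(candidate_set_sizes)
    return 1 - (mean(ranks) - 1) / (expected - 1)


# Made-up ranks with the number of candidates each one was ranked against.
ranks = [1, 2, 5, 10]
sizes = [100, 100, 50, 50]
amr = adjusted_mean_rank(ranks, sizes)
amri = adjusted_mean_rank_index(ranks, sizes)
print(f"AMR = {amr:.4f}, AMRI = {amri:.4f}")
```

Because the expectation depends on the per-triple candidate set sizes, the same raw mean rank can correspond to different adjusted values on datasets of different size.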
Prediction
Representations
- 🪛🔗 Change interactions' shape by @mberr in #736
- 🍝 🛰️ Update constrainer, initializer, and normalizer resolution by @mberr in #742
- 🦄 🔢 Only get representations for unique indices by @mberr in #743
- ⛔💠 Remove get in canonical shape by @mberr in #745
- ⏩🛌 Fix dtype forwarding in Embedding by @mberr in #746
- 📏🕳️ Move normalization to base representation by @mberr in #818
- ✏️📏 Unify representation module nomenclature by @mberr in #811
- ✨💤 Resolve Representations by @mberr in #803
Trackers
- Add loss kwargs to ResultTracker by @Rodrigo-A-Pereira in #741
- 🪡💻 Fix typo in ConsoleTracker.log_metrics by @mberr in #787
Fixes
- 🍌🍍 Fix ValueError during size probing on GPU machines by @mberr in #821
- 🪄➰ Fix device error in training loop by @mberr in #774
- ☕📱 Fix filterer's device by @mberr in #801
- ⛵💻 Make sure indices are moved to device by @mberr in #800
Documentation, Typing, and Packaging
- 🌊 👋 Goodbye to setup.py and Makefile for building the docs by @cthoyt in #761
- 🌌 🥛 Update Constants and Types by @mberr in #754
- 🔫 🐈⬛ Update black by @cthoyt in #764
- 🐍 💪 Add Python 3.10 support by @cthoyt in #831
- 🥰 📙 Update argument passing and documentation by @cthoyt in #842
- 🍊 ⌨️ Typing Updates by @cthoyt in #760
- ⚙️📚 Fix HPO doc by @mberr in #820
- 📖🔪 Extend documentation on subbatching and slicing by @mberr in #810
Misc
- ✉️ ♻️ Add list of available configurations to usage message of reproduction by @mberr in #753
- 🦎⚡ Update class-resolver by @cthoyt in #775
Full Changelog: v1.7.0...v1.8.0
v1.7.0
New Models
- Add BoxE by @ralphabb in #618
- Add TripleRE by @mberr in #712
- Add AutoSF by @mberr in #713
- Add Transformer by @mberr in #714
- Add Canonical Tensor Decomposition by @mberr in #663
- Add (novel) Fixed Model by @cthoyt in #691
- Add NodePiece model by @mberr in #621
Updated Models
- Update R-GCN configuration by @mberr in #610
- Update ConvKB to ERModel by @cthoyt in #425
- Update ComplEx to ERModel by @mberr in #639
- Rename TranslationalInteraction to NormBasedInteraction by @mberr in #651
- Fix generic slicing dimension by @mberr in #683
- Rename UnstructuredModel to UM and StructuredEmbedding to SE by @cthoyt in #721
- Allow to pass unresolved loss to ERModel's __init__ by @mberr in #717
Representations and Initialization
- Add low-rank embeddings by @mberr in #680
- Add NodePiece representation by @mberr in #621
- Add label-based initialization using a transformer (e.g., BERT) by @mberr in #638 and #652
- Add label-based representation (e.g., to update language model using KGEM) by @mberr in #652
- Remove literal representations (use label-based initialization instead) by @mberr in #679
Training
- Fix displaying previous epoch's loss by @mberr in #627
- Fix kwargs transmission on MultiTrainingCallback by @Rodrigo-A-Pereira in #645
- Extend Callbacks by @mberr in #609
- Add gradient clipping by @mberr in #607
- Fix negative score shape for sLCWA by @mberr in #624
- Fix epoch loss for loss reduction != "mean" by @mberr in #623
- Add sLCWA support for Cross Entropy Loss by @mberr in #704
Inference
- Add uncertainty estimate functions via MC dropout by @mberr in #688
- Fix predict top k by @mberr in #690
- Fix indexing in predict_* methods when using inverse relations by @mberr in #699
- Move tensors to device for predict_* methods by @mberr in #658
Trackers
- Fix wandb logging by @mberr in #647
- Add multi-result tracker by @mberr in #682
- Add Python result tracker by @mberr in #681
- Update file trackers by @cthoyt in #629
Evaluation
- Store rank count by @mberr in #672
- Extend evaluate() for easier relation filtering by @mberr in #391
- Rename sklearn evaluator and refactor evaluator code by @cthoyt in #708
- Add additional classification metrics via rexmex by @cthoyt in #668
Triples and Datasets
- Add helper dataset with internal batching for Schlichtkrull sampling by @mberr in #616
- Refactor splitting code and improve documentation by @mberr in #709
- Switch np.loadtxt to pandas.read_csv by @mberr in #695
- Add binary I/O to triples factories by @cthoyt in #665
Torch Usage
- Use torch.finfo to determine suitable epsilon values by @mberr in #626
- Use torch.isin instead of own implementation by @mberr in #635
- Switch to using torch.inference_mode instead of torch.no_grad by @sbonner0 in #604
Miscellaneous
- Add YAML experiment format by @mberr in #612
- Add comparison with reproduction results during replication, if available by @mberr in #642
- Adapt hello_world notebook to API changes by @dobraczka in #649
- Add testing configuration for Jupyter notebooks by @mberr in #650
- Add empty default loss_kwargs by @mali-git in #656
- Optional extra config for reproduce by @mberr in #692
- Store pipeline configuration in pipeline result by @mberr in #685
- Fix upgrade to sequence by @mberr in #697
- Fix pruner use in hpo_pipeline by @mberr in #724
Housekeeping
v1.6.0
This release is only compatible with PyTorch 1.9+. Because of some changes,
it's now pretty non-trivial to support both older and newer versions, so going
forward PyKEEN will continue to support the latest version of PyTorch and try
its best to keep backwards compatibility.
New Models
- DistMA (#507)
- TorusE (#510)
- Frequency Baselines (#514)
- Gated Distmult Literal (#591, thanks @Rodrigo-A-Pereira)
New Datasets
New Losses
- Double Margin Loss (#539)
- Focal Loss (#542)
- Pointwise Hinge Loss (#540)
- Soft Pointwise Hinge Loss (#540)
- Pairwise Logistic Loss (#540)
Added
- Tutorial in using checkpoints when bringing your own data (#498)
- Learning rate scheduling (#492)
- Checkpoints include entity/relation maps (#498)
- QuatE reproducibility configurations (#486)
Changed
- Reimplement SE (#521) and NTN (#522) with new-style models
- Generalize pairwise loss and pointwise loss hierarchies (#540)
- Update to use PyTorch 1.9 functionality (#489)
- Generalize generator strategies in LCWA (#602)
Fixed
- FileNotFoundError on Windows/Anaconda (#503, thanks @Hao-666)
- Fixed docstring for ComplEx interaction (#504)
- Make DistMult the default interaction function for R-GCN (#548)
- Fix gradient error in CompGCN buffering (#573)
- Fix splitting of numeric triples factories (#594, thanks @Rodrigo-A-Pereira)
- Fix determinism in splitting of triples factory (#500)
- Fix documentation and improve HPO suggestion (#524, thanks @kdutia)
v1.5.0
New Metrics
New Trackers
New Models
- QuatE (#367)
- CompGCN (#382)
- CrossE (#467)
- Reimplementation of LiteralE with arbitrary combination (g) function (#245)
New Negative Samplers
- Pseudo-typed Negative Sampler (#412)
Datasets
- Removed invalid datasets (OpenBioLink filtered sets; #439)
- Added WK3l-15K (#403)
- Added WK3l-120K (#403)
- Added CN3l (#403)
Added
- Documentation on using PyKEEN in Google Colab and Kaggle (#379, thanks @jerryIsHere)
- Pass custom training loops to pipeline (#334)
- Compatibility layer for the fft module (#288)
- Official Python 3.9 support, now that PyTorch has it (#223)
- Utilities for dataset analysis (#16, #392)
- Filtering of negative sampling now uses a bloom filter by default (#401)
- Optional embedding dropout (#422)
- Added more HPO suggestion methods and docs (#446)
- Training callbacks (#429)
- Class resolver for datasets (#473)
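For readers unfamiliar with the idea behind filtering negative samples with a Bloom filter, here is a small self-contained sketch. It is an assumption-laden illustration, not PyKEEN's filterer: the BloomFilter class, its parameters, and the hashing scheme are all hypothetical. The key property is that a Bloom filter never reports a stored triple as absent, so a known positive is never kept as a negative, while a small fraction of true negatives may be discarded as false positives.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter over (head, relation, tail) ID triples (illustrative only)."""

    def __init__(self, num_bits=1 << 16, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, triple):
        key = repr(triple).encode("utf-8")
        for i in range(self.num_hashes):
            # A different salt per round gives independent-looking hash functions.
            digest = hashlib.blake2b(
                key, digest_size=8, salt=i.to_bytes(8, "little")
            ).digest()
            yield int.from_bytes(digest, "little") % self.num_bits

    def add(self, triple):
        for pos in self._positions(triple):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, triple):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(triple)
        )


# Known positive triples (made-up IDs).
positives = [(0, 0, 1), (1, 0, 2)]
known = BloomFilter()
for triple in positives:
    known.add(triple)

# Candidate negatives; any candidate the filter recognizes is dropped.
candidates = [(0, 0, 1), (0, 0, 3), (1, 0, 2)]
kept = [t for t in candidates if t not in known]
print(kept)
```

The memory footprint is fixed by num_bits regardless of how many triples are stored, which is the usual reason to prefer this over an exact set lookup on large graphs.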
Updated
- R-GCN implementation now uses new-style models and is super idiomatic (#110)
- Enable passing of interaction function by string in base model class (#384, #387)
- Bump scipy requirement to 1.5.0+
- Updated interfaces of models and negative samplers to enforce kwargs (#445)
- Reorganize filtering, negative sampling, and remove triples factory from most objects (#400, #405, #406, #409, #420)
- Update automatic memory optimization (#404)
- Flexibly define positive triples for filtering (#398)
- Completely reimplemented negative sampling interface in training loops (#427)
- Completely reimplemented loss function in training loops (#448)
- Forward-compatibility of embeddings in old-style models and updated docs on how to use embeddings (#474)
Fixed
- Regularizer passing in the pipeline and HPO (#345)
- Saving results when using multimodal models (#349)
- Add missing diagonal constraint on MuRE Model (#353)
- Fix early stopper handling (#419)
- Fixed saving results from pipeline (#428, thanks @kantholtz)
- Fix OOM issues with early stopper and AMO (#433)
- Fix ER-MLP functional form (#444)