pycottas
pycottas is a library for working with compressed RDF files in the COTTAS format. COTTAS stores triples in a triple table using the Apache Parquet format. pycottas is built on top of DuckDB and provides an HDT-like interface.
Features
- Compression and decompression of RDF files.
- Querying COTTAS files with triple patterns.
- RDFLib store backend for querying COTTAS files with SPARQL.
- Supports RDF datasets (quads).
- Can be used as a library or via command line.
COTTAS Files
COTTAS stands for COlumnar Triple TAble Storage and is based on the Apache Parquet file format. A COTTAS file consists of a table with s, p, o, g columns representing triples (and named graphs):
- The s, p, o, g columns are filled with the RDF terms of the triples/quads.
- When a triple belongs to the default graph, g is NULL. If all the triples in the RDF dataset belong to the default graph, g can be omitted.
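The table layout described above can be sketched with Python's standard library. This is only an illustration of the s, p, o, g model and of triple-pattern matching over it: it uses sqlite3 as a stand-in for Parquet and DuckDB, it does not use the pycottas API, and the `quads` table name, the `match` helper, and the example IRIs are all hypothetical.

```python
import sqlite3

# Illustration only: an in-memory SQLite table stands in for a COTTAS file.
# Real COTTAS files are Parquet files; the table name "quads" is an assumption.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE quads (s TEXT, p TEXT, o TEXT, g TEXT)")

# RDF terms are stored as strings; g is NULL (None) for the default graph.
rows = [
    ("<http://example.org/alice>", "<http://xmlns.com/foaf/0.1/knows>",
     "<http://example.org/bob>", None),
    ("<http://example.org/alice>", "<http://xmlns.com/foaf/0.1/name>",
     '"Alice"', None),
    ("<http://example.org/bob>", "<http://xmlns.com/foaf/0.1/name>",
     '"Bob"', "<http://example.org/graph1>"),
]
con.executemany("INSERT INTO quads VALUES (?, ?, ?, ?)", rows)


def match(con, s=None, p=None, o=None):
    """Hypothetical helper: evaluate a triple pattern, None is a wildcard."""
    clauses, params = [], []
    for col, val in (("s", s), ("p", p), ("o", o)):
        if val is not None:
            clauses.append(f"{col} = ?")
            params.append(val)
    where = " AND ".join(clauses) or "1 = 1"
    sql = f"SELECT s, p, o, g FROM quads WHERE {where} ORDER BY s, p, o"
    return con.execute(sql, params).fetchall()


# All triples with the foaf:name predicate, in any graph.
name_triples = match(con, p="<http://xmlns.com/foaf/0.1/name>")

# Triples in the default graph are the rows where g is NULL.
default_graph = con.execute(
    "SELECT s, p, o FROM quads WHERE g IS NULL"
).fetchall()
```

If every row had a NULL g, the column would carry no information, which is why it can be dropped when the whole dataset lives in the default graph.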
Licenses
pycottas is available under the Apache License 2.0.
The documentation is licensed under CC BY-SA 4.0.
Author
Citing
If you use pycottas in your work, please cite the ISWC paper:
@inproceedings{arenas2026cottas,
title = {{COTTAS: Columnar Triple Table Storage for Efficient and Compressed RDF Management}},
author = {Arenas-Guerrero, Julián and Ferrada, Sebastián},
booktitle = {Proceedings of the 24th International Semantic Web Conference},
year = {2026},
publisher = {Springer Nature Switzerland},
isbn = {978-3-032-09530-5},
pages = {313--331},
doi = {10.1007/978-3-032-09530-5_18},
}
