jsonl.dump_archive¶
Write multiple JSON Lines files into a ZIP or TAR archive. Supports compressed TAR variants and custom serialization.
Function Signature¶
jsonl.dump_archive(
path,
data,
*,
opener=None,
text_mode=True,
dump_if_empty=True,
json_dumps=None,
**json_dumps_kwargs,
)
Parameters¶
| Parameter | Type | Default | Description |
|---|---|---|---|
path |
str |
(required) | Destination path for the archive file |
data |
Iterable[tuple[str, Iterable[Any]]] |
(required) | Iterable of (relative_path, items) tuples |
opener |
Callable or None |
None |
Custom function to open the given file paths |
text_mode |
bool |
True |
If False, write bytes instead of text |
dump_if_empty |
bool |
True |
If False, don't create empty files or an empty archive |
json_dumps |
Callable or None |
None |
Custom serialization function. Defaults to json.dumps |
**json_dumps_kwargs |
Additional keyword arguments passed to the serialization function |
Returns¶
str or None — Path to the created archive file, or None if no items were written and dump_if_empty is False.
Supported Archive Formats¶
| Extension | Format |
|---|---|
.zip |
ZIP |
.tar |
TAR (uncompressed) |
.tar.gz |
TAR + gzip |
.tar.bz2 |
TAR + bzip2 |
.tar.xz |
TAR + xz |
Warning
If the archive already exists at the given path, it will be overwritten.
Note
- Paths in the
dataargument must be relative. Absolute paths will raise aValueError. - If
datacontains multiple items for the same path, they are appended to the corresponding file within the archive.
Examples¶
Create a ZIP archive¶
import jsonl
data = [
("users.jsonl", [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]),
("orders.jsonl", [{"id": 1, "total": 99.90}, {"id": 2, "total": 45.00}]),
]
jsonl.dump_archive("archive.zip", data)
Create a compressed TAR archive¶
import jsonl
data = [
("users.jsonl", [{"name": "Alice", "age": 30}]),
("logs/activity.jsonl", [{"event": "login", "ts": "2025-01-01T00:00:00Z"}]),
]
jsonl.dump_archive("archive.tar.gz", data)
Compressed files inside the archive¶
Individual files within the archive can also be compressed:
import jsonl
data = [
("file1.jsonl", [{"name": "Alice", "age": 30}, {"name": "Bob", "age": 25}]),
("path/to/file2.jsonl.gz", [{"name": "Charlie", "age": 35}]),
("file1.jsonl", [{"name": "Eve", "age": 28}]), # Appends to file1.jsonl
]
jsonl.dump_archive("archive.zip", data)