Commit 0b268131 authored by Claude Becker

storage: shorten by extracting general advice

parent 8114f7e6
General advice regarding storage
================================
### Regular housekeeping
Delete what is no longer needed. Document folder structure and file locations for your future self and others.
### Avoid too many files in the same directory
Don't store thousands of files in a single folder. Listing the contents of such a folder becomes very slow. Create a folder hierarchy to split the files up for faster access.
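
One way to split up an overfull folder after the fact is to move each file into a subfolder derived from a hash of its name. The following is a minimal sketch, not a recommended tool: the function name `shard_directory`, the choice of SHA-256, and the two-character prefix (at most 256 subfolders) are all assumptions made for illustration. A hierarchy based on meaningful names (date, run number, sample) is usually preferable when one exists.

```python
import hashlib
import shutil
from pathlib import Path


def shard_directory(flat_dir: Path, prefix_chars: int = 2) -> int:
    """Move every regular file directly inside flat_dir into a subfolder
    named after the first `prefix_chars` hex digits of the SHA-256 hash
    of the file name. Returns the number of files moved.
    """
    moved = 0
    # Materialise the listing first so we do not iterate while moving files.
    for entry in list(flat_dir.iterdir()):
        if not entry.is_file():
            continue  # existing subfolders are left alone
        digest = hashlib.sha256(entry.name.encode("utf-8")).hexdigest()
        bucket = flat_dir / digest[:prefix_chars]
        bucket.mkdir(exist_ok=True)
        shutil.move(str(entry), str(bucket / entry.name))
        moved += 1
    return moved
```

Running the function a second time moves nothing, because only files at the top level of `flat_dir` are considered.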
### Combine small files into single tar
Don't store your data in thousands of files of only a few kilobytes each. Even the smallest file always occupies at least one block of the minimal block size and therefore wastes disk space. Working with many small files also incurs a large overhead of system calls and disk seek operations. Combine such results into larger files using `tar` (with or without compression).
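
The same result as `tar` on the command line can be obtained from a script. Below is a minimal sketch using Python's standard `tarfile` module; the function names `pack_directory` and `list_archive` are hypothetical helpers introduced here, not part of any D-PHYS tooling. The sketch leaves the original files in place, so they still have to be deleted once the archive has been verified.

```python
import tarfile
from pathlib import Path


def pack_directory(src_dir: Path, archive: Path, compress: bool = True) -> int:
    """Pack all regular files below src_dir into a single tar archive.

    Returns the number of files added. The originals are not deleted.
    """
    mode = "w:gz" if compress else "w"
    archive_resolved = archive.resolve()
    # Collect the files before the archive is created, and skip the
    # archive itself in case it is written inside src_dir.
    files = sorted(
        p for p in src_dir.rglob("*")
        if p.is_file() and p.resolve() != archive_resolved
    )
    with tarfile.open(archive, mode) as tar:
        for path in files:
            # Store paths relative to src_dir so the archive is relocatable.
            tar.add(path, arcname=path.relative_to(src_dir).as_posix())
    return len(files)


def list_archive(archive: Path) -> list:
    """Return the member names without extracting anything."""
    with tarfile.open(archive, "r:*") as tar:
        return tar.getnames()
```

For example, `pack_directory(Path("results"), Path("results.tar.gz"))` would turn a folder of small result files into one compressed archive, and `list_archive` can be used to check its contents before removing the originals.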
### Prefer binary formats to plain text
Consider optimized binary formats (e.g. HDF5) to store your data, but make sure they're open and well documented so you can still read your data in five years' time. Binary formats use less disk space and allow faster input/output than plain text files. Common formats have libraries for most programming languages to ease read/write operations.
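
To get a feeling for the difference in size, the sketch below writes the same floating-point numbers once as plain text and once as raw 64-bit binary values, using only the Python standard library. Note the assumptions: the raw dump is used purely to illustrate the size difference and is not a self-describing format, since it carries no metadata about type, shape or byte order. For real data, a documented format such as HDF5 is the better choice. The function names `write_both` and `read_binary` are hypothetical.

```python
import array
from pathlib import Path


def write_both(values, text_path: Path, binary_path: Path):
    """Write the same floats once as text and once as raw binary doubles.

    Returns (text_size, binary_size) in bytes.
    """
    # Plain text: one full-precision decimal number per line.
    with open(text_path, "w") as fh:
        for v in values:
            fh.write(repr(v) + "\n")
    # Binary: native-endian doubles, no header and no metadata.
    with open(binary_path, "wb") as fh:
        array.array("d", values).tofile(fh)
    return text_path.stat().st_size, binary_path.stat().st_size


def read_binary(binary_path: Path, count: int):
    """Read `count` doubles back from a file written by write_both."""
    data = array.array("d")
    with open(binary_path, "rb") as fh:
        data.fromfile(fh, count)
    return data.tolist()
```

Each value takes a fixed number of bytes in the binary file (eight on common platforms), whereas its full-precision text representation typically needs more than twice that, plus the cost of parsing on every read.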
### File servers are not meant for backups
Do not use group shares to store backups of your machines. We provide a wide range of [[backup solutions|backups]]. If you're unsure which one of those is suitable for you, please get in touch.
@@ -6,26 +6,7 @@ All owners of a D-PHYS [[account/]] have a certain amount of disk space on our f
 General Advice
 --------------
-### Regular housekeeping
-Delete what is no longer needed. Document folder structure and file locations for your future self and others.
-### Avoid too many files in the same directory
-Don't store thousands of files in a single folder. Listing the contents of such a folder becomes very slow. Create a folder hierarchy to split the files up for faster access.
-### Combine small files into single tar
-Don't store your data in thousands of files of only a few kilobytes each. Even the smallest file always occupies at least one block of the minimal block size and therefore wastes disk space. Working with many small files also incurs a large overhead of system calls and disk seek operations. Combine such results into larger files using `tar` (with or without compression).
-### Prefer binary formats to plain text
-Consider optimized binary formats (e.g. HDF5) to store your data, but make sure they're open and well documented so you can still read your data in five years' time. Binary formats use less disk space and allow faster input/output than plain text files. Common formats have libraries for most programming languages to ease read/write operations.
-### File servers are not meant for backups
-Do not use group shares to store backups of your machines. We provide a wide range of [[backup solutions|backups]]. If you're unsure which one of those is suitable for you, please get in touch.
+Please follow our [[general advice]] regarding file management on our storage systems.
 Access the files
 ----------------