Convert Disk Usage to Hierarchical Markdown
Objective: to create a well-organized, nested Markdown report showing disk usage of directories and files.
Workflow
- Data Collection
  From your chosen base directory, run `du -h` on each top-level folder to generate recursive disk usage reports, saving each folder’s output as a `.du.txt` file.
- Conversion
  The Python script `duDirs2MD.py` reads each `.du.txt` file and converts it into a clean, structured `.md` file with proper nesting.
- Final Output
  All `.md` files are merged in alphabetical order into one master Markdown document.
Markdown Structure
- `#` → Top-level folder (highest level)
- `##` → Level 2 subdirectories
- `###` → Level 3 subdirectories
- `####` → Level 4 and deeper
- `- [ ]` → Leaf items (files or deepest folders)
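The mapping above can be sketched as a small function. This is only an illustration: `markdown_line` is a hypothetical helper, not part of `duDirs2MD.py`, and it assumes that the nesting level equals the number of path components and that level 4 and deeper all share `####`.

```python
def markdown_line(size, path, is_leaf):
    """Return the Markdown line for one du entry (hypothetical helper)."""
    clean = path.strip("/")
    if is_leaf:
        # Leaf items (files or deepest folders) become checkboxes
        return f"- [ ] {size} {clean}"
    depth = clean.count("/") + 1      # subdir1 -> 1, subdir1/subdir2 -> 2
    hashes = "#" * min(depth, 4)      # assumption: level 4 and deeper share ####
    return f"{hashes} {size} {clean}"


print(markdown_line("120G", "subdir1/", False))
print(markdown_line("45G", "subdir1/subdir2/", False))
print(markdown_line("8G", "subdir1/subdir2/subdir3/fileA", True))
```

Capping the heading level keeps deeply nested folders readable, since Markdown itself only defines six heading levels.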
Generic Example
Sample input (from du; individual files are listed only when du is run with `-a`):

    120G subdir1/
    45G subdir1/subdir2/
    15G subdir1/subdir2/subdir3/
    8G subdir1/subdir2/subdir3/fileA
    5G subdir1/subdir2/subdir3/fileB
    30G subdir1/subdir2/subdir4/
    75G subdir1/subdir5/
Output Markdown:
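    # 120G subdir1
    ## 45G subdir1/subdir2
    ### 15G subdir1/subdir2/subdir3
    - [ ] 8G subdir1/subdir2/subdir3/fileA
    - [ ] 5G subdir1/subdir2/subdir3/fileB
    - [ ] 30G subdir1/subdir2/subdir4
    - [ ] 75G subdir1/subdir5

Folders with no listed children (here subdir4 and subdir5) count as leaf items, so they appear as checkboxes rather than headings.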
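Real `du` output lists a directory after its contents, so the lines arrive bottom-up rather than in the top-down order shown in the sample. The sketch below is one way to parse such lines and put them in top-down order before any Markdown is written. The function names are hypothetical, and sorting by path components is an alternative to simply reversing the file.

```python
def parse_du_lines(lines):
    """Parse 'SIZE PATH' lines into (size, path_components) tuples."""
    entries = []
    for line in lines:
        parts = line.strip().split(None, 1)   # split on the first whitespace run
        if len(parts) != 2:
            continue                           # skip blank or malformed lines
        size, path = parts
        components = [c for c in path.split("/") if c]
        if components:
            entries.append((size, components))
    return entries


def top_down(entries):
    """Order entries so that every directory precedes its contents."""
    return sorted(entries, key=lambda entry: entry[1])


# Bottom-up order, as du would print it (the file line assumes du -a)
sample = [
    "8G\tsubdir1/subdir2/subdir3/fileA",
    "15G\tsubdir1/subdir2/subdir3",
    "45G\tsubdir1/subdir2",
    "75G\tsubdir1/subdir5",
    "120G\tsubdir1",
]
for size, components in top_down(parse_du_lines(sample)):
    print(len(components), size, "/".join(components))
```

The first number printed for each entry is its nesting depth, which is what selects the heading level.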
Full Execution Plan
- Select the base directory and generate the du script:

      cd BaseDir; mkdir Z-GetSizes
      ls | awk -F/ '{print "du -h "$1" > Z-GetSizes/"$1".du.txt"}' > Z-GetSizes/duSubDirs.sh

- Execute the script; this will be the single most time-consuming step:

      chmod 755 Z-GetSizes/duSubDirs.sh
      ./Z-GetSizes/duSubDirs.sh

- Create the script that converts a du listing to Markdown format:

      cat << 'EOF' > duDirs2MD.py
      import re
      import sys
      from pathlib import Path


      def parse_line(line):
          """Parse a du line into (size, path)."""
          line = line.strip()
          if not line:
              return None, None
          # Split on the first run of whitespace
          match = re.match(r'^(\S+)\s+(.+)$', line)
          if match:
              return match.group(1), match.group(2)
          return None, None


      def build_tree(lines):
          """Build a nested tree from du path entries with sizes."""
          tree = {}
          path_to_size = {}
          all_paths = []
          for line in lines:
              size, path = parse_line(line)
              if not size or not path:
                  continue
              components = [c for c in path.split('/') if c]
              if not components:
                  continue
              all_paths.append(components)
              path_to_size[tuple(components)] = size

          # Build nested dict
          for components in all_paths:
              current = tree
              for i, comp in enumerate(components):
                  if comp not in current:
                      current[comp] = {
                          'size': path_to_size.get(tuple(components[:i+1]), ''),
                          'children': {},
                          'is_leaf': False,
                      }
                  current = current[comp]['children']

          # Mark leaves (nodes with no children)
          def mark_leaves(node):
              if not node['children']:
                  node['is_leaf'] = True
              else:
                  for child in node['children'].values():
                      mark_leaves(child)

          for root_node in tree.values():
              mark_leaves(root_node)
          return tree, path_to_size


      def generate_markdown(tree, output_file):
          """Generate Markdown with proper nesting."""
          md_lines = []

          def write_node(node_dict, depth, full_path):
              size = node_dict.get('size', '')
              header = f"{size} {full_path}" if size else full_path
              if node_dict.get('is_leaf', False):
                  # Use a checkbox for leaves (files or deepest folders)
                  md_lines.append(f"- [ ] {header}")
              else:
                  # Use a heading for directories; level 4 and deeper share ####
                  hashes = '#' * min(depth, 4)
                  md_lines.append(f"{hashes} {header}")
              # Sort children for consistent output
              for child_name, child_node in sorted(node_dict['children'].items()):
                  write_node(child_node, depth + 1, f"{full_path}/{child_name}")

          # Assume a single root (the top-level folder)
          for root_name, root_node in tree.items():
              write_node(root_node, 1, root_name)
              break  # only one root expected

          with open(output_file, 'w', encoding='utf-8') as f:
              f.write('\n'.join(md_lines))
          print(f"Generated: {output_file}")


      def main():
          if len(sys.argv) < 2:
              print("Usage: python duDirs2MD.py <input_file>")
              sys.exit(1)
          input_file = sys.argv[1]
          input_path = Path(input_file)
          if not input_path.exists():
              print(f"Error: File {input_file} not found")
              sys.exit(1)

          # Output filename: replace .du.txt with .md (or just change the extension)
          if input_path.suffix == '.txt' and '.du' in input_path.stem:
              output_name = input_path.stem.replace('.du', '') + '.md'
          else:
              output_name = input_path.stem + '.md'
          output_file = input_path.parent / output_name

          with open(input_file, 'r', encoding='utf-8') as f:
              lines = f.readlines()

          # du prints directory totals last; reverse so parents come first
          tree, _ = build_tree(reversed(lines))
          generate_markdown(tree, output_file)


      if __name__ == "__main__":
          main()
      EOF

- Test by converting one du text file to Markdown:

      python duDirs2MD.py Z-GetSizes/TestDir.du.txt    # => Z-GetSizes/TestDir.md

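- Convert all directories, either by running the conversions directly:

      ls Z-GetSizes/*.du.txt | awk '{print "time python duDirs2MD.py "$1}' | sh

  or by saving them to a script first:

      ls Z-GetSizes/*.du.txt | awk '{print "time python duDirs2MD.py "$1}' > Do.All.SubDirs.sh

- Execute the script (only needed if you saved it):

      chmod 755 Do.All.SubDirs.sh; ./Do.All.SubDirs.sh
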
- Join all the Markdown output into one file (each `.md` file is written next to its `.du.txt` input, so they are in Z-GetSizes):

      printf '%s\n' Z-GetSizes/*.md | sort | while IFS= read -r file; do
        cat "$file"
        echo ""   # ensures clean separation between folders
      done > All.SubDirs.md

- Open the final document, All.SubDirs.md, in your favourite Markdown viewer/editor (see options below).
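For readers who would rather stay in Python, the collection and merge steps of the plan can be sketched as below. This is a rough equivalent under stated assumptions, not the author's method: `collect_sizes` and `merge_markdown` are hypothetical names, only top-level folders (not loose files) are measured, files are separated by a blank line, and Python's default sort order may differ from the shell's locale-aware `sort`.

```python
import subprocess
from pathlib import Path


def collect_sizes(base_dir, out_dir_name="Z-GetSizes"):
    """Run `du -h` on each top-level folder and save it as NAME.du.txt."""
    base = Path(base_dir)
    out_dir = base / out_dir_name
    out_dir.mkdir(exist_ok=True)
    for folder in sorted(p for p in base.iterdir() if p.is_dir()):
        if folder.name == out_dir_name:
            continue  # do not measure the report folder itself
        result = subprocess.run(
            ["du", "-h", folder.name],
            cwd=base, capture_output=True, text=True, check=False,
        )
        report = out_dir / f"{folder.name}.du.txt"
        report.write_text(result.stdout, encoding="utf-8")
    return out_dir


def merge_markdown(md_dir, merged_file):
    """Concatenate every .md file in md_dir, in sorted order."""
    merged = Path(merged_file)
    parts = []
    for md_file in sorted(Path(md_dir).glob("*.md")):
        if md_file.resolve() == merged.resolve():
            continue  # never merge the output into itself
        parts.append(md_file.read_text(encoding="utf-8").rstrip("\n"))
    merged.write_text("\n\n".join(parts) + "\n", encoding="utf-8")
    return merged
```

A typical run would call `collect_sizes("BaseDir")`, convert each report with `duDirs2MD.py`, and finish with `merge_markdown("BaseDir/Z-GetSizes", "BaseDir/All.SubDirs.md")`. Skipping the merged file inside the loop guards against the output being folded into itself when it lives in the same folder as the inputs.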
Popular Markdown Editors & Viewers
- Obsidian — Excellent for adding content, knowledge base features, and plugins
- Zettlr — Great for viewing and academic/long-form work
- Typora — Beautiful distraction-free WYSIWYG experience
- Visual Studio Code — Powerful, free, with excellent Markdown support
- iA Writer — Minimalist, focus-oriented writing app
- MarkText — Free, open-source, clean interface
- Bear — Beautiful Markdown app (Apple ecosystem)
- Logseq — Outliner-style knowledge base
- Dillinger — Online Markdown editor
- StackEdit — Powerful browser-based Markdown editor