fix(folderparser): use csv.reader to handle RFC 4180 quoted fields in annotation CSVs #490
… annotation CSVs Split-on-comma parsing silently corrupted file_name when filenames or labels contained commas inside quoted fields. Switch to csv.reader. Raw line text preserved for server reconstruction. Adds regression tests for both regular and _classes.csv paths. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
275b3ef to 51d148c
I have completed the agreement
Problem
`_parseAnnotationCSV` used `line.split(",")` to extract filenames. This silently corrupts `file_name` when a filename contains commas inside a quoted field, which is valid per RFC 4180. Any dataset with such filenames fails to link images to their annotations.

Fix
Replace the manual split with `csv.reader` (stdlib). The raw line text is preserved verbatim for server upload reconstruction. The file is opened in text mode so `\r\n` normalises to `\n` (backward compatibility with Windows CSVs is unchanged).

Tests
- `test_parse_csv_quoted_filename`: regular CSV with a comma-containing filename
- `test_parse_multilabel_csv_quoted_filename`: `_classes.csv` with a comma-containing filename

🤖 Generated with Claude Code
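
For reviewers, the failure mode and the fix described above can be illustrated with a minimal sketch. It is not the code from this PR: the helper name `parse_annotation_lines`, the column layout, and the sample rows are assumptions made up for illustration. It shows how a plain comma split truncates a quoted filename, while `csv.reader` returns it intact and the raw line stays available unchanged.

```python
import csv

# Hypothetical annotation CSV; the first data row has a comma inside a quoted file name.
SAMPLE = (
    "filename,width,height,class,xmin,ymin,xmax,ymax\n"
    '"img,001.jpg",640,480,cat,10,20,110,220\n'
    "plain.jpg,640,480,dog,1,2,3,4\n"
)


def parse_annotation_lines(text):
    """Hypothetical helper: return (file_name, raw_line) pairs, skipping the header."""
    pairs = []
    for raw_line in text.splitlines()[1:]:
        if not raw_line.strip():
            continue  # ignore blank lines
        # csv.reader accepts any iterable of strings, so one line can be parsed on its own.
        fields = next(csv.reader([raw_line]))
        # Keep the raw text untouched next to the parsed file name.
        pairs.append((fields[0], raw_line))
    return pairs


first_row = SAMPLE.splitlines()[1]
naive_name = first_row.split(",")[0]  # old behaviour: stops at the comma inside the quotes
pairs = parse_annotation_lines(SAMPLE)

print(repr(naive_name))
print(repr(pairs[0][0]))
```

Parsing one line at a time keeps the raw text available verbatim, at the cost of not supporting quoted fields that contain embedded newlines; whether that trade-off matches the actual implementation is an assumption here.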