Abstract
A wide range of molecular representations exists today, from human-readable structural diagrams through line notations such as Wiswesser Line Notation (WLN) and SMILES to several dozen computer-readable file formats. Still, these formats are not the method of choice for entering molecular structures into computer systems, since they cannot be read easily and without error by optical recognition. The present study explores a two-dimensional (PDF417) barcode representation of molecular structures in SMILES format that enables the user to read and input molecular structures into computer systems in a fully automated fashion. A Lempel-Ziv-Welch (LZW) based compressed version of SMILES is suggested for cases where the size of the structure exceeds the storage capacity of PDF417 barcodes. Alternatively, the compact ACS format may be employed as a structural representation. Input via barcodes is fast, fully automatic, and practically error-free, because the 2D barcodes used employ error correction. A Web application interface has been developed that interprets these barcodes and exports them as optimized 3D chemical structures. Applications of this representation range from maintaining automated storage systems to Web-based tracking of molecular samples. The National Chemical Laboratory, Pune, employs 2D barcode encoded structures for in-house repository management, where barcodes can also be used to query the database for similar structures or substructures of the query structure.
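To illustrate the idea of LZW-compressed SMILES, the following is a minimal sketch of textbook LZW applied to a SMILES string. It is not the implementation used in the study: the function names `lzw_compress` and `lzw_decompress` are hypothetical, the input is assumed to be plain ASCII, and the sketch stops at a list of integer codes, leaving out the packing of those codes into bytes that would be needed before encoding them in a barcode.

```python
from __future__ import annotations


def lzw_compress(text: str) -> list[int]:
    """Textbook LZW: turn an ASCII string into a list of integer codes."""
    # Initial dictionary: one code per single character (codes 0-255).
    table = {chr(i): i for i in range(256)}
    next_code = 256
    current = ""
    codes = []
    for ch in text:
        candidate = current + ch
        if candidate in table:
            # Keep extending the current match.
            current = candidate
        else:
            # Emit the longest known match and learn the new sequence.
            codes.append(table[current])
            table[candidate] = next_code
            next_code += 1
            current = ch
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list[int]) -> str:
    """Inverse of lzw_compress: rebuild the dictionary while decoding."""
    if not codes:
        return ""
    table = {i: chr(i) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    out = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # Special case: the code refers to the entry being defined now.
            entry = previous + previous[0]
        else:
            raise ValueError(f"invalid LZW code: {code}")
        out.append(entry)
        table[next_code] = previous + entry[0]
        next_code += 1
        previous = entry
    return "".join(out)


if __name__ == "__main__":
    smiles = "CC(=O)OC1=CC=CC=C1C(=O)O"  # aspirin, used only as an example
    codes = lzw_compress(smiles)
    assert lzw_decompress(codes) == smiles
    print(len(smiles), "characters ->", len(codes), "codes")
```

Because SMILES strings for larger molecules tend to repeat short fragments such as ring atoms and chain segments, the dictionary built during compression captures these repeats; how much space this saves in practice depends on the molecule and on how the codes are packed.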