The collection and processing of spatial data produces many files that need to be properly managed, from raw data like the photographs and coordinates in photobatches to the context volumes that have been transformed repeatedly through processing. It can be a struggle to manage these files efficiently and to produce appropriate metadata for them; the guidelines below have been designed to ensure proper record-keeping and the creation of archival versions of various files that will eventually be incorporated into the database or published. It is very important that these guidelines are followed to maintain consistency between years and between individual participants. If you encounter a situation that is not addressed by one of these guidelines, discuss it with a supervisor so that it can be integrated into the workflow to ensure consistency with all aspects of data management.
Managing Data for Photobatch Processing
The processing of photobatches occurs primarily during excavation. A full protocol on processing photobatches and managing working files can be found here, but in the end the completed data should be moved or copied into a series of archival folders located at gygaia\3DSpatial\processed\[EA]. Below is a summary of the archival files that should exist:
- The photobatch folder, named with the photobatch number (e.g., 201607101246), should be moved into a sub-folder called “pointclouds.” Each photobatch folder should contain the following:
  - A sub-folder called “sfm” containing the photographs and any Metashape files used for processing
  - An .las file that contains the dense point cloud of the model
  - A .txt or .csv file containing the coordinates for georeferencing the model, formatted so that they are ready to import into a Metashape file
  - A .txt file containing xyz point data from the dense point cloud
  - A .pdf report exported from Metashape
- The orthomosaic (e.g., P201607101246) should be copied from gygaia\gisgps\kap\[year]\rasters\photoscan_exports (working folder) to gygaia\3DSpatial\processed\[EA]\orthomosaics (archival folder)
- The DEM (e.g., Z201607101246) should similarly be copied from the working folder in gisgps to the archival folder in 3DSpatial
Note: For pause contexts, in which top and bottom photobatches are taken with the same photobatch number, some additional file management will likely be necessary. These photobatch folders should contain sub-folders called “top” and “bottom,” each of which should contain all the necessary archival files described above.
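The checklist above can be partly automated. The following is a minimal sketch, not part of the official protocol: the function names are hypothetical, and because the guidelines give no naming rule for the individual files, it identifies them by extension only. It assumes that the coordinate file and the xyz point file are two separate .txt/.csv files, and that a pause context is recognizable by the presence of a “top” or “bottom” sub-folder.

```python
from pathlib import Path


def missing_archival_items(folder: Path) -> list:
    """Return descriptions of expected archival items not found in `folder`."""
    if not folder.is_dir():
        return ["folder does not exist"]
    missing = []
    if not (folder / "sfm").is_dir():
        missing.append('"sfm" sub-folder')
    files = [p for p in folder.iterdir() if p.is_file()]
    suffixes = [p.suffix.lower() for p in files]
    if ".las" not in suffixes:
        missing.append(".las dense point cloud")
    if ".pdf" not in suffixes:
        missing.append(".pdf Metashape report")
    # Assumption: coordinates (.txt or .csv) and xyz points (.txt) are two
    # separate files, so at least two text-like files should be present.
    if sum(s in (".txt", ".csv") for s in suffixes) < 2:
        missing.append("coordinate file and/or xyz point file")
    return missing


def check_photobatch(folder: Path) -> dict:
    """Check a photobatch folder; pause contexts are checked per top/bottom."""
    parts = [folder / name for name in ("top", "bottom")]
    if any(p.is_dir() for p in parts):
        return {p.name: missing_archival_items(p) for p in parts}
    return {folder.name: missing_archival_items(folder)}
```

Calling `check_photobatch(Path(...))` on an archival photobatch folder returns a dictionary of missing items; an empty list means nothing was flagged. A check like this only confirms that files with the right extensions exist, so it does not replace opening the files to confirm that they are the correct exports.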
In most cases, these archival files should be static, to serve as reference material or as the bases for further analysis (e.g., construction of context volumes). However, there are cases where photobatches need to be reprocessed. Reprocessing is most often done for older photobatches that have unresolved issues like holes or high error values, but might also be done to improve the quality of a model or its exports in advance of subsequent analyses or publication.
Depending on the nature of the reprocessing, some or all of the guidelines below will apply; follow every one that does to maintain data integrity.
- Do not delete any old data in the Metashape project prior to reprocessing. Instead, create a copy of the chunk that contains the model, and rename the copy with the current date. Any changes should be made in the new chunk, preserving the integrity of the original data.
- Note that it might be necessary to save a new copy of the Metashape file to complete certain processes due to changes in the software (e.g., the change from Photoscan to Metashape). If necessary, save the new file with the same name and do not delete the old file.
- Do not save any chunks from a different photobatch in the Metashape file for a given model. If you need to work with chunks from two different models in the same Metashape file, you should do so in a different file elsewhere on the server.
- Note that some “pause contexts” (in which a top and bottom photobatch were collected for the same context with the same number) were processed in such a way that the top and bottom models are in the same Metashape file. These need to be separated into different Metashape files when they are found. (See here for more information.)
- Be aware that derivative files from photobatches—most prominently the orthomosaics, DEMs, and .las files—are used for other tasks like drawing illustrations and building context volumes. If you think that the reprocessed model is a significant improvement on the original model, you should also create new exports to replace the now out-of-date archival files.
- Old files should be moved into a sub-folder called “superseded” within their current location. Sometimes this “superseded” folder will already exist; if it does not, create it yourself.
- In the rare situation where multiple superseded files will need to exist in the same folder, add an underscore and the date of the original export from the metadata of the file (e.g., _20160710) to the file names to distinguish the superseded files from one another.
- Record the date of the reprocessing, along with any notes and other necessary data, in the database. For example, if you are reprocessing a photobatch, you should include information on why it is being reprocessed and whether you exported any files from the reprocessed model.
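The “superseded” convention described above can be sketched as a small helper. This is an illustration under stated assumptions, not an official tool: the function name `supersede` is hypothetical, and the file's modification time is used as a stand-in for the date of the original export, which may be wrong if the file has been copied or edited since. When in doubt, check the metadata by hand. Note also that the sketch only adds a date to the incoming file when a name collision occurs; it does not rename files already in the “superseded” folder.

```python
import shutil
from datetime import datetime
from pathlib import Path


def supersede(path: Path) -> Path:
    """Move `path` into a "superseded" sub-folder beside it; return the new location."""
    target_dir = path.parent / "superseded"
    target_dir.mkdir(exist_ok=True)  # create the folder only if it is missing
    target = target_dir / path.name
    if target.exists():
        # Assumption: modification time approximates the original export date.
        stamp = datetime.fromtimestamp(path.stat().st_mtime).strftime("%Y%m%d")
        target = target_dir / f"{path.stem}_{stamp}{path.suffix}"
        if target.exists():
            # Never overwrite old data; leave the decision to a person.
            raise FileExistsError(f"{target} already exists; resolve by hand")
    shutil.move(str(path), str(target))
    return target
```

The function refuses to overwrite anything, in keeping with the rule that old data should never be deleted. The move itself still needs to be recorded in the database as described above.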
Managing Data for Context Volume Production
The process of creating context volumes generally happens after excavation, often years after the original excavation of a given context. A full protocol on processing context volumes and managing working files can be found here, but below is a summary of the necessary files and their archiving.
- Point clouds representing the top and bottom of a given context are the starting point for producing a context volume. These can be found at gygaia\3DSpatial\processed\[EA]\pointclouds\[photobatch number]. Make sure that you are using the most recent, highest-quality .las file available; see the section above on managing data for photobatch processing for more information on how to locate these files.
- Multiple files are created during the PCPro step of the protocol, but only one of these files is archived long term: the .txt file that records the settings used to produce the point cloud exports. These .txt files should be saved with the context number (e.g., 10) as the file name in gygaia\3DSpatial\processed\[EA]\context_volumes\pcpro_summaries.
- Multiple .ply files are created during the Cloud Compare step of the protocol; these are the clipped top and base point clouds and the final mesh, which is the file that we refer to as the “context volume.” These .ply files should be saved with the context number (e.g., 10) in the respective “top,” “base,” and “mesh” folders in gygaia\3DSpatial\processed\[EA]\context_volumes.
- Once a volume has been processed, the date of processing, analyst name, and any important notes should be recorded in the database.
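To make the archival locations above concrete, the sketch below builds the four expected paths for a single context. It is only an illustration: the function name is hypothetical, the value passed for `ea` is a placeholder for whatever replaces [EA] in the folder structure, and `root` is assumed to be the location of the gygaia folder as written in the paths above.

```python
from pathlib import PureWindowsPath


def context_volume_paths(root: str, ea: str, context: int) -> dict:
    """Return the archival paths expected for one context volume."""
    base = PureWindowsPath(root) / "3DSpatial" / "processed" / ea / "context_volumes"
    # The PCPro settings summary is the only PCPro file archived long term.
    paths = {"pcpro_summary": base / "pcpro_summaries" / f"{context}.txt"}
    # Clipped top and base point clouds, plus the final mesh (the context volume).
    for part in ("top", "base", "mesh"):
        paths[part] = base / part / f"{context}.ply"
    return paths


# "EA1" is a placeholder value, not a real folder name.
for label, path in context_volume_paths("gygaia", "EA1", 10).items():
    print(label, path)
```

`PureWindowsPath` is used so that the paths print with backslashes, matching the notation in these guidelines, regardless of the operating system the script runs on.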
Occasionally, a context volume will need to be reprocessed. When this happens, follow the guidelines below to preserve data integrity.
- Do not delete any old data (meshes, tops/bases, PCPro summaries). Rather, move them to a “superseded” sub-folder in their current location.
- If the volume for a context is reprocessed repeatedly and multiple superseded files exist in the same folder, add an underscore and the original creation date (e.g., _20240315) to the end of each file name to differentiate the files from one another.
- Record the date of the reprocessing and other necessary information in the database.