This function attempts to assign a GOV identifier to each location in a GEDCOM file.
:param data: content of one GEDCOM file
:param resultQualityChecker: content for one line of the file "quality.csv"
:param filename: name of the file/source
:param miniGov: list of merged entries of the Mini-GOV
:return: list of dictionaries containing the identification for each location
"""
# keep the quality check result under a descriptive name (this is a second reference, not a copy)
gedcomMetaInfo=resultQualityChecker
# definition of banned object types
# banned object types are object types in the GOV that should not be used for identification
# currently all ecclesiastical objects (up to and including 263), all legal objects (e.g. courts, from 263) and administrative divisions outside Germany that make allocation difficult (from 257)
# list of object types: http://gov.genealogy.net/type/list (retrieved on 8 December 2020)
# sometimes there is no English translation of the names of the object types
bannedObjectTypes=bannedObjects()
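# "data" would be altered by the cleaning with dataCleaner and could no longer be used in its original form
# therefore copies are created that are not just further references to the same list
# (copy.copy makes a shallow copy, which is sufficient as long as the rows are immutable strings)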
initialGedcomData=copy.copy(data)
gedcomData=copy.copy(data)
# clean up every urbanonym in the GEDCOM file by going through gedcomData row by row
for cleanCounter in range(len(gedcomData)):
    resultParser=qualitychecker.gedcomRowParser(gedcomData,cleanCounter)  # separate the data of one row
    tag=resultParser[2]  # GEDCOM tag
    behindTag=resultParser[3]  # data behind the GEDCOM tag
    behindTag=behindTag.lower()  # lowercase behindTag for easier cleaning
    # for urbanonyms:
    if tag=="PLAC":
        dataCleaned=dataCleaner(behindTag)
        # overwrite the original GEDCOM line with the cleaned text