This function searches the Mini-GOV for location names.
:param minigov: list of merged entries of the Mini-GOV
:param value: name of the urbanonym
:return: list with two values: (1) the line number in the Mini-GOV if the search result is unique, otherwise -1; (2) the number of hits found
"""
# name of the Mini-GOV column to be searched ("aktueller Name" means "current name")
key = "aktueller Name"
# basic cleanup of the place name:
# cut off everything from the first comma onward
try:
    valueCleaned = value[:value.index(",")]
except ValueError:
    # no comma in the name, so keep it unchanged
    valueCleaned = value
# list collecting the line numbers of matching Mini-GOV entries
hitsNumberList = []
# list collecting the urbanonyms of matching Mini-GOV entries
hitsUrbanonymList = []
# Binary search algorithm for searching the Mini-GOV
# initial position is the center of the Mini-GOV
position = len(minigov) // 2
# position value of the previous iteration
# initially not 0, because this would lead to complex numbers in the formulas (roots of negative numbers)
previousPosition = len(minigov)
# search until the distance to the previous position is less than 10
while (previousPosition - position) not in range(-10, 10):
    # temporary storage, because position changes and the previous value previousPosition is still needed
    previousPositionCache = position
numberClusters += 1  # count the total number of clusters
# after one hit there can be no second hit, because placelist contains only unique values;
# coordinates that occur twice are counted twice, because this whole block is executed multiple times
break
# calculate the average coordinates of the whole source
if numberOfCoordinates != 0:  # avoid division by zero
    longitude = longitude / numberOfCoordinates
    latitude = latitude / numberOfCoordinates
else:
    longitude = "NONE"
    latitude = "NONE"
# per GEDCOM file
# calculate the number of different clusters
existingCluster = []  # list of assigned clusters
clusterMeanList = []  # list of averages of all clusters in a file for further processing
numberOfFinalCluster = 0
# save only the numberClusters from the clusterlist
# small clusters can be excluded via numberLatLong; the minimum cluster size must be at least 1
# only clusters that really exist are considered, hence at least 1
# this check must go here, because otherwise the divisor would be distorted and
# clusters would also be created where there is no cluster entry (e.g. 23)
if numberLatLong >= minimumClusterSize:
    lat = lat / numberLatLong  # divisor is non-zero, since the minimum cluster size is at least 1
    long = long / numberLatLong  # divisor is non-zero, since the minimum cluster size is at least 1
    # the list is used for further calculations to determine/cluster locations
    clusterMeanList.append([lat, long])
    # count the remaining clusters (those with at least the minimum size)