Hartmut Beyer / VD17-Dump
Commit 3833b411, authored 3 years ago by beyer
Documentation
Parent: 6a917479
README.md: 21 additions, 15 deletions
@@ -132,30 +132,39 @@ The dump can be regenerated with a Python script. To do so, first...
```bash
sudo git clone https://github.com/hbeyer/pylib
```
-In the `pylib` directory created by this, the following script can then be run with Python 3:
+In the `pylib` directory created by this, scripts can then be run with Python 3.
+Downloading the PICA-XML files:
```python
 import logging
-import zipfile
-import os
 from lib import sru
-from lib import isil
-from lib import recordlist as rl
-from lib import xmlreader as xr
-from lib import pica
 logging.basicConfig(level=logging.INFO)
-# Paths must be adjusted; the folders must already exist
+# Folder must exist
 source_folder = f"{folder for downloading the PICAXML files}"
-target_folder = f"{folder for creating the JSON files}"
-# Download the PICA-XML files (should only be run once, as it is time-consuming)
+# Download the PICA-XML files
 req = sru.Request_VD17()
 num = req.prepare("pica.bbg=(Aa* or Af* or Av*)")
 logging.info(f"Number of records: {req.numFound}")
 req.download("source_folder")
```
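The download step requires the target folder to exist beforehand. A minimal sketch of preparing it with the standard library follows. The helper name `ensure_folder` and the example path are hypothetical and do not appear in the README. Whether `req.download` accepts a `Path` or only a string is not shown in the source.

```python
from pathlib import Path


def ensure_folder(path: str) -> Path:
    """Create the folder (and any missing parents) if needed and return it.

    Hypothetical helper for illustration; not part of pylib.
    """
    folder = Path(path).expanduser()
    # exist_ok=True makes repeated runs harmless
    folder.mkdir(parents=True, exist_ok=True)
    return folder


if __name__ == "__main__":
    # "downloads/picaxml" is an assumed example path
    source_folder = ensure_folder("downloads/picaxml")
    print(source_folder.is_dir())
```

The resulting path could then be handed to the download call, for example as `req.download(str(source_folder))`. Note that the README snippet passes the string literal `"source_folder"` rather than the variable, which may be unintended.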
+Generating the JSON serialization:
```python
+import logging
+from lib import pica
+from lib import isil
+from lib import recordlist as rl
+from lib import xmlreader as xr
+logging.basicConfig(level=logging.INFO)
+# Set the folders and the size of the JSON files to be generated
+source_folder = "{folder containing the PICAXML files}/"
+target_folder = "{folder for saving the JSON files}/"
+size = 1000
 # Read the PICA-XML files
 reader = xr.DownloadReader(source_folder, "record", "info:srw/schema/5/picaXML-v1.0")
 content = []
@@ -166,11 +175,10 @@ count = 0
 for node in reader:
     content.append(pica.RecordVD17(node))
     count += 1
-    if count >= 1000:
+    if count >= size:
         recl = rl.RecordList(content)
         fn = f"vd17-{str(setn).zfill(3)}"
         recl.to_json(target_folder + fn)
-        fnn.append(fn + ".json")
         content = []
         setn += 1
         count = 0
@@ -178,8 +186,6 @@ if content != []:
     recl = rl.RecordList(content)
     fn = f"vd17-{str(setn).zfill(3)}"
     recl.to_json(target_folder + fn)
-    fnn.append(fn + ".json")
```
> Written with [StackEdit](https://stackedit.io/).
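The loop in the diff collects records until `size` is reached, writes them to a numbered file, and writes any remainder at the end. The same batching pattern can be sketched with the standard library only. Plain dictionaries stand in for `pica.RecordVD17` objects and `json.dumps` stands in for `RecordList.to_json`. The function names, the `prefix` parameter, the starting index of 1, and the `.json` suffix are assumptions, since the corresponding lines are not visible in the diff.

```python
import json
from pathlib import Path
from typing import Iterable, Iterator


def batched(items: Iterable[dict], size: int) -> Iterator[list[dict]]:
    """Yield consecutive lists of at most `size` items."""
    batch: list[dict] = []
    for item in items:
        batch.append(item)
        if len(batch) >= size:
            yield batch
            batch = []
    if batch:  # remainder smaller than `size`
        yield batch


def write_batches(records: Iterable[dict], target_folder: str,
                  size: int = 1000, prefix: str = "vd17-") -> list[str]:
    """Write each batch to <prefix>NNN.json and return the file names."""
    target = Path(target_folder)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    # Numbering from 1 is an assumption; the diff does not show the initial value of `setn`
    for setn, batch in enumerate(batched(records, size), start=1):
        path = target / f"{prefix}{str(setn).zfill(3)}.json"
        path.write_text(json.dumps(batch, ensure_ascii=False), encoding="utf-8")
        written.append(path.name)
    return written
```

Separating the batching generator from the file writing means the remainder case is handled in one place, instead of repeating the write logic after the loop as the README script does.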