Commits on Source (383)
Showing
with 319 additions and 79 deletions
# Master Thesis
This repro contains the master thesis of Philip Müller.
This Repository contains the master thesis of Philip Müller.
It is tracked with git, which also helps to organize it.
## Scorpia
However, some parts of the project are not managed by git,
because they would pollute the history.
For example, the following are not managed:
- The generated data.
- Some Jupyter notebooks, however the main ones are checked in.
Most of the folders contain a README.md file such as this one.
The file describes the content of the folder, including
a summary of the contents of its subfolders,
as well as instructions for using certain programs.
## GIT
The project is tracked with git. It was used for _everything_.
Thus the LOG is a good source for information.
### IGNORE
Git can be configured to ignore certain files and whole directories.
For that the .gitignore files are used. The behaviour can be configured
on a per-directory basis.
The project contains several ignore files that should work well.
## Most Important Places
Here is a short list of the most important parts of the project and where they are located.
- Report ~ "doc/report"
In this folder you will find the PDF of the report (the declaration of originality is the
last page). You will also find a folder that contains the LaTeX source.
- Code ~ "code/scorpia_cpp"
In this folder you will find the root of the C++ codebase, meaning a CMake file.
## SCORPIA
This is the code name of the project.
Scorpia was a game reviewer in the 80s and 90s.
Scorpia was a legendary game reviewer in the 80s and 90s.
She was famous/feared for her harsh criticism of games she did not like.
## Folder Structure
Here is a list of all folders in this repro and what they contain.
You can find more READMEs in the folders.
### Folder Structure
These are the main folders of the project.
In each of them you will find a README.md file, that further describes
its content.
### PAPERS
#### PAPERS
This folder contains all papers that have a connection with the project.
### CODE
This folder contains everything that is connected to the code that is written in this thesis.
### DOC
This folder contains everything that is related to documentation.
This includes the manual (if one will be written), or reports.
However it also contains project management related documents.
#### CODE
This folder contains everything that is connected to the code that
has been written during this thesis.
### ANALYZE
This folder contains everything that deals with the analyzing and visualization of the data.
#### DOC
This folder contains everything that is related to documentation and related
documents. For example, the final report and the diary are located there.
#### ANALYZE
This folder contains everything that deals with the analyzing and
visualization of the data.
......@@ -6,12 +6,19 @@ This folder contains everything that is related to analyzing and visualization o
Here is a list of folders that are located here.
They are created to separate the analyzing into several pieces.
## SIBYL
SIBYL is the main analysis framework.
It is named after an ancient Greek prophetess.
She wished for immortality but continued to age; unable to die, she now seeks death.
However, Sibyl is also a very powerful computer in the anime Psycho-Pass, where she controls Japan with an iron fist.
For historical reasons I name _all_ my analysis frameworks Sibyl.
It is named after an ancient Greek prophetess. She wished for immortality but continued to age; unable to die,
she now seeks death. However, Sibyl is also a very powerful computer in the anime Psycho-Pass, where she
controls Japan with an iron fist. For historical reasons I name _all_ my analysis frameworks Sibyl.
And actually this is Sibyl 2.0.
You will find instructions for starting SIBYL within this folder. However, the notebooks for the thesis are
not located in it, and neither are the data; in fact, none of them are inside this repro.
## SIBYL_REPORT
In this folder you will find the notebooks that were written for the thesis. I used them to analyse the
data. However, the raw data is not there.
# INFO
This folder contains the notebooks that I used for analysing the result of the simulation.
However I did not include the raw data because they are too large.
## Using the Notebooks
In order to use the notebooks you must first unpack them and copy them into a location that
Jupyter can find. A suitable path is "../sibyl/libSibyl/", which points to the
working directory of Jupyter if DOMINATOR.sh was used to start Sibyl.
## Structure of the Folders
In the report we discuss 4 experiments, each of the subfolders will contain the result for
one of them. Which experiment is located in which folder should be clear. In addition we
also have included the notebook for the rotTC case, which mainly shows how the tools can
be used.
In each of the folders you will find two files.
### ZIP-File
This file contains a printout of the notebook. It is stored as a LaTeX source file and must
first be compiled. It is important that the command "pdflatex" is used for this; using
plain latex will not work.
Compiling generates a PDF file that shows the notebook.
### XZ-File
This is the actual notebook that was used to analyse the data. It is stored with the pictures
in it. To view it you must open it with Jupyter; instructions are given above.
Note that the notebooks on their own are of limited use, because further analysis
requires the raw data, which _cannot_ be added to this repro.
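The unpacking step described above could be sketched as follows. This is only a minimal sketch: the function names `notebook_name` and `install_notebook` and the archive name used in the usage note are hypothetical, and it assumes that each XZ-file is a single compressed notebook (not a tar archive).

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the repository.

# Derive the notebook file name from the archive path by dropping the ".xz" suffix.
notebook_name() {
    basename "$1" .xz
}

# Decompress an archived notebook into a destination folder; the archive itself is kept.
install_notebook() {
    local archive="$1"
    local dest="$2"
    xz --decompress --stdout "$archive" > "${dest}/$(notebook_name "$archive")"
}
```

A call such as `install_notebook experiment.ipynb.xz ../sibyl/libSibyl/` would then place the notebook where Jupyter can find it, assuming the relative path fits the folder you call it from.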
File added
# Info
This folder contains SIBYL.
SIBYL is the main analysis framework for the thesis.
It is built on Jupyter notebooks, which allow the data to be analyzed interactively.
To give a better experience, a set of classes was written to encapsulate things in an abstract way.
The center of this is the data base.
A data base encapsulates an HDF5 database and gives access to it in a uniform way.
## DOMINATOR
The script dominator.sh is provided to ease the use of SIBYL.
You can call it without options to create a new analyze session, a suitable template is loaded.
If you call it with one argument, the argument will be used as the name instead of a generic one.
Using the "load" keyword you can load an old analyze session.
If you want to run the server on a remote machine use the REMOTE argument as first argument to enable that.
### REMOTE Mode
The remote mode was introduced to run the server on a remote machine that has more resources.
For that the REMOTE argument is supplied.
It is recommended to run the script in a screen session.
However, the script has a bug (which was not present in earlier stages): a browser is opened in remote mode as well.
In remote mode it will most likely call LYNX or another terminal browser; I have no idea why, since I have disabled the browser.
So it could be hard to get the token to log into the server.
To get the token for logging into the server you must use the following command:
This folder contains SIBYL. SIBYL is the main analyze framework used in this thesis.
It is built on Jupyter notebooks, which allow the data to be analyzed in an interactive manner.
You will not find an introduction to Python or Jupyter inside this project. Please use this
as an experimental ground or consult a tutorial on the internet.
This guide assumes that you have at least some experience with Jupyter.
Please also read the README in the LIBSIBYL folder.
## Dependencies
In order to be used, several external components are needed.
- Jupyter:
For running the notebooks.
- Python:
As runtime.
- NumPy & SciPy:
Numerical calculations.
- Matplotlib:
For plotting.
- h5py:
Accessing HDF5 files from within Python.
- GCC:
Is needed for compiling some auxiliary code structures.
- git:
Needed for the internal version control.
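The command-line tools from the list above can be checked before starting anything. The following is only a sketch in the spirit of the checks in DOMINATOR.sh; the function name `missing_commands` is hypothetical, and the Python packages (NumPy, SciPy, Matplotlib, h5py) are not covered by it.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository.
# Print every command from the argument list that cannot be found on PATH.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "$cmd"
        fi
    done
}
```

Calling `missing_commands gcc jupyter git` prints nothing when all three tools are installed, and otherwise prints one missing tool per line.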
## Design
SIBYL is actually only a collection of functions that were written to automate common tasks
that are needed frequently.
SIBYL is written around the data base. As described elsewhere, the simulation states are
stored using HDF5 files. The data base allows the data to be accessed in an easy way. Its main job
is basically to compose the path at which the dataset was stored and load it from disk.
### CSIBYL
Some computationally intense parts are written in C. Python is able to dynamically load libraries
and use the functions inside them. We exploit this to provide the functionality. The DOMINATOR
script will take care of everything. However, it will deactivate asserts.
## DOMINATOR.sh
The script "dominator.sh" is provided to ease the use of SIBYL. It will perform all needed
initialization tasks. You should only use SIBYL through this script, otherwise it may not
work correctly.
DOMINATOR.sh also accepts a "--help" flag, which will print out some basic information.
### Working Directory
The script will enter the subfolder LIBSIBYL. Jupyter is also called from this folder.
Thus all paths are relative to that folder and _not_ to this one.
### Usage
There are two ways of usage. As a novice it is recommended to use the first one and call
the script without any arguments.
#### Calling Without Arguments
The script will enter the LIBSIBYL folder and perform the setup steps. It will then create
a new notebook for the current session. For this a template file is used.
On most machines it should open your browser and load the notebook.
If not, see below how to get the URL.
#### REMOTE Mode
This mode is activated by supplying the argument "REMOTE" to the script. It is intended to
run the notebook on a different PC, this is especially useful if your machine is not that
powerful. You may have to modify the script and change the IP address and the port as well.
You must set it to the IP address of the machine you are running on.
It is highly recommended to run the script in a screen session, and the script will remind
you about this. However at least on my machine, I was not able to suppress the opening of
a browser, so it may be possible that LYNX is opened, or an error happens.
In order to get the URL to access the notebook you have to enter the following command:
jupyter-notebook list
which will output the list of all servers together with the tokens.
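If only the URL is wanted, the output of the command can be filtered. The sketch below assumes the usual output format of the classic Jupyter notebook server, where each server is printed as `URL :: directory` after a header line; the function name `server_urls` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository.
# Read the output of "jupyter-notebook list" on stdin and print only the URLs
# (which carry the token), one per line. The header line is skipped because
# it does not start with "http".
server_urls() {
    awk '/^https?:\/\// { print $1 }'
}
```

It would be used as `jupyter-notebook list | server_urls`. In REMOTE mode the host part of the URL may have to be replaced by the IP address of the remote machine.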
## Storage of Notebooks and HDF5
See the subfolder for an explanation how the storage works.
## Folders
Here is a small list of folders that are used.
### LIBSIBYL
This folder contains everything that is related to SIBYL.
It contains the Python code and the analyzing notebooks.
All the magic happens in that folder.
Before doing anything else the dominator.sh script will enter this folder.
......
......@@ -25,6 +25,38 @@ NO_BROWSER_COM=""
# This variable can be used to determine, if we should pull/push to the notebook repro
DO_PULLING="1"
## Make some test to screen the dependency
if hash gcc &>/dev/null
then
NOTHING=1
else
echo "Could not detect GCC on your system."
echo " GCC is needed for CSIBYL."
echo " ABORT."
exit 70
fi
if hash jupyter &>/dev/null
then
NOTHING=1
else
echo "Could not detect JUPYTER on your system."
echo " JUPYTER is needed for SIBYL."
echo " ABORT."
exit 71
fi
if hash git &>/dev/null
then
NOTHING=1
else
echo "Could not detect GIT on your system."
echo " GIT is needed for SIBYL."
echo " ABORT."
exit 71
fi
#####################
......@@ -34,7 +66,7 @@ then
echo "It will set up all the external work and create set up a new notebook that will be used."
echo "It supports some options, they must appear in the order they are presented here."
echo " "
echo -e "--no-pull This option will skipt the pulling from teh NB repro (not recomended)."
echo -e "--no-pull This option will skip the pulling from the NB repro (not recommended)."
echo -e "REMOTE [ip: IP] Will create a server that listen to [IP]:${PORT_TO_LISTEN}."
echo -e " This must be the ip address of this PC, defaults to 192.168.1.120."
echo " "
......@@ -43,6 +75,9 @@ then
echo "notebook. In order to get the token to log in use \"jupyter-notebook list\" to get"
echo "a list of all running servers, with the token."
echo " "
echo "The script uses a primitive way to determine if the NB folder \"${NOTEBOOK_FNAME}\","
echo "contains a git repro or not. All git commands will only be performed if it is considered one."
echo " "
exit 0
fi
......@@ -73,7 +108,10 @@ fi
if [ ! -d "${SIBYL_FOLDER}/${NOTEBOOK_FNAME}" ]
then
echo "No notebook folder is present, creating it."
echo "It is recommended to create a git folder in it."
mkdir -p "${SIBYL_FOLDER}/${NOTEBOOK_FNAME}"
sleep 30
fi
#
......@@ -110,6 +148,7 @@ then
IP_TO_LISTEN="192.168.1.120"
PORT_TO_LISTEN="8888"
# ",," makes to all lower case
if [ $# -ge 1 ] && [ "${1,,}" == "ip:" ]
then
if [ $# -lt 1 ]
......@@ -171,15 +210,24 @@ fi
# Pull the repro of the Notebooks
if [ ${DO_PULLING} -ne 0 ]
then
echo "Start pulling from the Notebook-Repro"
git -C "${NOTEBOOK_FNAME}" pull --rebase --ff-only
if [ $? -ne 0 ]
# Primitive test
if [ ! -e "${NOTEBOOK_FNAME}/.git" ]
then
echo "Failed to pull the changes from the notebook repro."
echo "Abort"
echo "Pulling was selected but \"${NOTEBOOK_FNAME}\" does not seem to contain a git repro."
echo " Since pulling is activated by default, try with --no-pull again."
sleep 30
else
echo "Start pulling from the Notebook-Repro"
git -C "${NOTEBOOK_FNAME}" pull --rebase --ff-only
if [ $? -ne 0 ]
then
echo "Failed to pull the changes from the notebook repro."
echo "Abort"
exit 1
exit 1
fi
fi
fi # end: pulling from repro
......@@ -234,27 +282,44 @@ echo "Actions are no longer requiered."
echo "============================="
echo " "
# Now we add all the change of the notebooks to the repro
# We will add all changes, because other notebooks were changed.
# We use the -C option to virtually change the folder
if [ "${DO_PULLING}" -ne 0 ]
# Test if a git folder is present if so, add the changes to the folder
if [ -e "${NOTEBOOK_FNAME}/.git" ]
then
echo "Publishing the changes on the notebooks."
echo "Add and commit the changes to the repro."
DATE2="$(date +%Y-%m-%d\ %R)"
git -C "${NOTEBOOK_FNAME}" add .
if [ $? -ne 0 ]
then
echo "Git add failed."
echo "Abort."
exit 1
fi
# Commit it, we assume that this commit will work.
git -C "${NOTEBOOK_FNAME}" commit --allow-empty -m "Automatic commit of the \"${DATE}\" series. Committed at ${DATE2}"
fi
echo " "
echo "Pushing the changes to the remote"
git -C "${NOTEBOOK_FNAME}" push
if [ $? -ne 0 ]
# Now we add all the change of the notebooks to the repro
# We will add all changes, because other notebooks were changed.
# We use the -C option to virtually change the folder
if [ "${DO_PULLING}" -ne 0 ]
then
# Primitive test
if [ -e "${NOTEBOOK_FNAME}/.git" ]
then
echo "Git push failed."
echo "Abort."
echo "Pushing the changes to the remote"
exit 1
git -C "${NOTEBOOK_FNAME}" push
if [ $? -ne 0 ]
then
echo "Git push failed."
echo "Abort."
exit 1
fi
else
echo "Pushing was enabled, but no repro detected."
sleep 5
fi
fi # end: push
......
# Ignore the folder with the notebooks
# Ignore the folder with the notebooks, the folder will contain
# a separate git repro, allowing to separate the main repro from it.
yNoteBooks/
# Ignore all Python/Jupyter related folders
......@@ -19,5 +20,13 @@ __pycache__/
*.jpg
*.png
# This is an internal folder
AUSTAUSCH
# Ignoring the files that are used by the CSIBYL runtimes
cLibSibyl_panic.log
cLibSibyl_panic_pid*.log
cLibSibyl_out.txt
cLibSibyl_out_pid*.txt
......@@ -2,27 +2,62 @@
This folder contains the implementation of SIBYL.
Currently these are some Python scripts that allow easy access to the stored data.
There is a support library written in C to perform some tasks that are not suited for Python.
I know that then everything, except the plotting, should have been done in C, but I lack the time to do that.
I know that everything, except the plotting, should have been done in C, but I lack the time to do that.
There is a script, a folder above, called DOMINATOR.sh. This script can be called to setup a notebook.
It will start the server, create a file and so on.
Note that in order for the library to work, the notebook must be located in this folder.
The script will create a notebook in the folder YNOTEBOOKS and create a symlink in this folder.
Which also works.
This will prevent the main repro from getting polluted.
Each time the DOMINATOR script is run, it will create a new notebook. As template it will use
the "Sibyl.ipynb" file. A copy will be created and it will be renamed.
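The copy-and-link step could look roughly like the following sketch. It is not the code of DOMINATOR.sh: the function name `new_notebook` and the naming scheme `Sibyl_<name>.ipynb` are assumptions made for illustration, and the function expects to be run inside this folder.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository.
# Copy the template into yNoteBooks/ under a new name and create a relative
# symlink in the current folder that points to the copy.
new_notebook() {
    local name="$1"
    local target="yNoteBooks/Sibyl_${name}.ipynb"
    # Refuse to overwrite an existing notebook.
    if [ -e "$target" ]; then
        echo "Notebook ${target} already exists." >&2
        return 1
    fi
    cp "Sibyl.ipynb" "$target" || return 1
    ln -s "$target" "Sibyl_${name}.ipynb"
}
```

After `new_notebook demo`, Jupyter sees `Sibyl_demo.ipynb` in this folder, while the actual file lives in the ignored notebook folder.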
## GITIGNORE
Note that this folder contains a .gitignore file that adds some more ignore patterns.
Most importantly, it adds "*.ipynb" to the list. This means that all files with
the extension ".ipynb", that is, all notebooks, are ignored.
## FOLDERS
Here is a list of folders that are important
### YNOTEBOOKS
This folder will contain all notebooks. They will not be controlled by the normal version control system,
since they occupy a large amount of space. DOMINATOR.sh will create symlinks to put them into this folder.
Note that the repro is accessible via the link "ssh://git@localhost:51551/phimuell/master_thesis_yNoteBooks.git".
This is a private repro, if you need access, please get in contact with me.
This folder should contain all notebook files that are used and made during the project.
git is instructed to ignore this folder, this means that all changes are not visible to
git. The main idea was not to pollute the main repro with noise generated by analyzing.
It is recommended, and DOMINATOR.sh does this, to create a symlink in this folder that
points to the notebook. Since git understands symlinks, it does not care about the file at the
other end.
However the folder contains the file ".gitkeep" that is tracked by the main git repro.
This has the effect that the folder is created upon cloning.
#### Actual Data
When you clone the main repro, the YNOTEBOOK folder will be empty, with the exception
of the .gitkeep file. As it was mentioned, you can request this data, for a certain
time at least. However the main repro contains a memento of the notebooks that were
used for the report, you can find them at "${GIT_ROOT}/analyze/Sibyl__REPORT".
You will need to copy the data to this "." folder.
However without the raw data, you can not analyse them, but you can view them.
#### Internal GIT-Repro
The folder is not tracked, thus no work in it is tracked either. However, at some point it
should be possible to track changes, but without polluting the main repro.
For this reason the folder contains, or should contain, its own git repro that is responsible
for managing it.
Currently this is a private repro, that can be made available upon request.
We would also like to note, that DOMINATOR.sh is aware of this and will use the repro.
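The test that DOMINATOR.sh uses for this is a simple check for a ".git" entry in the notebook folder. A minimal sketch of that check, wrapped in a function whose name `has_git_repo` is hypothetical:

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around the primitive test used by DOMINATOR.sh.
# Succeeds if the given folder contains a ".git" entry (directory or file).
has_git_repo() {
    [ -e "$1/.git" ]
}
```

The check is deliberately primitive: it does not verify that the repro is valid or that a remote is configured, so a later pull or push can still fail.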
### HDF5
This folder contains the actual HDF5 files that are used.
It contains an ignore file that will ignore anything in the folder.
This folder should contain all HDF5 files. However, if you get the folder, it will be
empty, because the actual data needs several terabit of storage, so it is not included.
However, all notebooks assume that the data is there. It contains a .gitignore file that
ignores everything inside it.
There are scripts that create symlinks in that folder to place files in it.
### LIBSIBYL
......
yNoteBooks/Sibyl_studyOfSimpleBall1.ipynb
\ No newline at end of file
# This is the Makefile of Sibyl 2.0.
# Since no C-code is built this file does nothing
# This is the Makefile of CSibyl 2.0
# All C-Code is located in the subdirectory cLibSibyl.
# This Makefile will call that Makefile and create a
# symbolic link to the shared object.
#
# It is important to note that using this Makefile will,
# by default, disable the debugging utilities of the
# code. Using the other Makefile does not do that.
# This will make all always out of date
.PHONY : all
all:
cd cLibSibyl ; make
cd cLibSibyl ; make NO_DEBUG=1
ln -s -f cLibSibyl/libcLibSibyl.so .