Commit 2cbc9405 authored by Andreas Hellerschmied's avatar Andreas Hellerschmied

First distribution v0.0.2

parent 63c2ecaa
# lifescale_utils
Data analysis tools for lifescale with GUI.

# Installation and setup
* **1. Create virtual environment (venv)**
  * `python3 -m venv env`
* **2. Activate virtual environment**
  * `source env/bin/activate`
* **3. Clone git repository to local machine**
  * `git clone git@gitlab.com:hellerdev/lifescale_gui.git`
  * `cd lifescale_gui`
* **4. Install required python packages using pip**
  * `pip install -r requirements.txt`

# Command line programs:
## ls2csv
The program *ls2csv* reads the content of the xlsm files written by lifescale units, parses the data and writes it to three csv files:
* Masses_Vibrio_[run-name].csv: Contains the data series from the sheet AcquisitionIntervals.
* SampleMetadata_[run-name].csv: Contains the data from the sheet PanelData.
* Summary_[run-name].csv: Contains the data from the sheet IntervalAnalysis.

### Usage:
* Conversion: `ls2csv -i [path and name of xlsm file] -o [output directory]`
* Help: `ls2csv -h`

## Installation issues on Ubuntu (20.04):
After installing PyQt5 with pip3, the following error occurred when trying to run a PyQt GUI: qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "" even though it was found.
This issue was resolved by installing the Qt dev tools (Designer, etc.):
* `sudo apt-get install qttools5-dev-tools`
# Test installation with setuptools
With command line interface.
* **1. Configure setup.py**
  * Define entry points (*console_scripts*)
* **2. Activate virtual environment**
  * e.g. `source env/bin/activate`
* **3. Run setup.py**
  * `python3 setup.py develop`

## Using make
`python3 setup.py develop`

# Run application on Windows and create a stand-alone Windows executable file:
TODO

# Comments on requirements.txt file:
* Two entries can be deleted:
  * -e git+git@gitlab.com:Heller182/grav.git@fe528c0769502e84a06be67a742032cacfd386df#egg=gravtools
  * pkg-resources==0.0.0 (created due to a bug when using Linux, see: https://stackoverflow.com/questions/39577984/what-is-pkg-resources-0-0-0-in-output-of-pip-freeze-command)

# Create HTML documentation with sphinx:
Run make in the gravtools/doc directory:
* `make html_doc`

# License and copyright
Copyright (C) 2022 Andreas Hellerschmied (<heller182@gmx.at>)

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
# Guidelines and conventions
## Code style:
* Respect the PEP conventions on python coding!
* PEP 8 -- Style Guide for Python Code: https://www.python.org/dev/peps/pep-0008/
* The maximum line length is 120 characters
* Use **type hints**: https://www.python.org/dev/peps/pep-0484/
* Use docstrings according to the numpy standard: https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard
* They are useful to generate the documentation automatically
* Example: https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html
* Comment code, if necessary!
* Use English language for the code, docstrings and comments
* German is allowed for user interfaces (GUI, command line), although English is preferred
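A minimal function following these conventions, with type hints and a numpy-style docstring (the function itself is an invented illustration, not part of the package):

```python
def mass_from_prominence(prominence: float, mass_transformation: float) -> float:
    """Convert a peak prominence to a mass estimate.

    Parameters
    ----------
    prominence : float
        Peak prominence in Hz.
    mass_transformation : float
        Calibration factor in Hz/fg.

    Returns
    -------
    float
        Estimated mass in fg.
    """
    return prominence / mass_transformation
```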
## Documentation and docstring style
* The API reference is created with sphinx (https://www.sphinx-doc.org/).
* Docstrings have to follow the numpy standard, see: https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard
* Examples: https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html
* Package documentation via docstring in __init__.py files
* Module documentation via docstring at first lines of py-file
* Documentation of classes, class methods and functions via docstrings
## Command line interface and executable scripts
* The command line interface is realized via entry points (console_scripts) in setuptools (python packaging tool)
* Input arguments are handled with argparse
* The code is located in the command_line module (gravtools/command_line.py)
* Executable scripts are located in gravtools/scripts
## Dependencies
* Required python packages are listed in requirements.txt
  * created with `pip freeze > requirements.txt`
## Version control with GIT
* Gitlab repository: https://gitlab.com/Heller182/grav
* Branching model:
* **master** branch: Current release version
* **develop** branch: Current working version.
* All team members merge their feature branches into develop (merge request via gitlab)
* Make sure that the develop branch contains a fully functional version of the code!
* **feature** branches: Branches of develop for the implementation of new features and other changes.
* Code changes only in feature branches!
* Naming convention: feature_<description of change/feature>, e.g. feature_new_tide_model
* Use gitignore files to prevent any data files (except example files), IDE control files, compiled python code, etc. from being stored in the GIT repository
* General rule: Ignore everything in a directory and define explicit exceptions!
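Following this rule, a data directory's .gitignore could look like the sketch below (the whitelisted file name is a hypothetical example):

```gitignore
# Ignore everything in this directory ...
*
# ... except the .gitignore itself and explicitly whitelisted example files:
!.gitignore
!example_run.xlsm
```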
## Packaging and distribution
* With setuptools
"""LifeScale utils is a utility program for handling data output.

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
...
along with this program. If not, see <https://www.gnu.org/licenses/>.

Andreas Hellerschmied (heller182@gmx.at)
"""
__version__ = '0.0.2'
__author__ = 'Andreas Hellerschmied'
__git_repo__ = 'tba'
__email__ = 'heller182@gmx.at'
......
"""Command line interface of lifescale utils.

Copyright (C) 2022 Andreas Hellerschmied <heller182@gmx.at>
"""
from lifescale.scripts.ls2csv import ls2csv as ls2csv_main
import argparse
import os


def is_file(filename):
    """Check whether the input string is the path of an existing file."""
    if os.path.isfile(filename):
        return filename
    raise argparse.ArgumentTypeError("'{}' is not a valid file.".format(filename))


def is_dir(pathname):
    """Check whether the input string is the path of an existing directory."""
    if os.path.exists(pathname):
        return pathname
    raise argparse.ArgumentTypeError("'{}' is not a valid directory.".format(pathname))


def ls2csv():
    """Command line interface including argument parser for the lifescale2csv converter."""
    parser = argparse.ArgumentParser(prog="ls2csv",
                                     description="Conversion from lifescale xlsm output to csv files",
                                     epilog="The ls2csv converter loads and parses xlsm files created by the lifescale "
                                            "unit. It writes several csv files to the output directory that contain "
                                            "extracted data from the input xlsm file in an easily readable way.",
                                     formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument("-i", "--input-xlsm", type=is_file, required=True,
                        help="Path and name of the input xlsm file created by lifescale.")
    parser.add_argument("-o", "--out-dir", type=is_dir, required=True, help="Output directory for the CSV files.")
    parser.add_argument("-nv", "--not-verbose", required=False, help="Disable command line status messages.",
                        action='store_true')
    # parser.add_argument("--out-dir", type=is_dir, required=False,
    #                     help="path to output directory", default=OUT_PATH)
    args = parser.parse_args()
    verbose = not args.not_verbose
    return ls2csv_main(xlsm_filename=args.input_xlsm, oputput_dir=args.out_dir, verbose=verbose)


if __name__ == '__main__':
    ls2csv()
import os
import platform
import json

DEFAULT_CONFIG = {
    "mass_transformation": 0.00574,
    "mass_cutoff": 20,
    "peak_width_cutoff": 5,
    "peak_distance_cutoff": 5,
    "raw_data_folder": "~/research/lifescale_raw_data_test/development_raw_data_folder"
}

LINUX_PATH = "./dev_config.json"
WINDOWS_PATH = r"C:\Users\LifeScale\Documents\peak_caller_config\peak_caller_config.json"


def load_config():
    """Load the platform-specific config file; if missing, write and return the default config."""
    if platform.system() == "Linux":
        try:
            with open(LINUX_PATH, "r") as f:
                config = json.load(f)
            return config, None
        except FileNotFoundError:
            config = DEFAULT_CONFIG
            with open(LINUX_PATH, "w") as f:
                json.dump(config, f)
            return config, LINUX_PATH
    elif platform.system() == "Windows":
        try:
            with open(WINDOWS_PATH, "r") as f:
                config = json.load(f)
            return config, None
        except FileNotFoundError:
            config = DEFAULT_CONFIG
            with open(WINDOWS_PATH, "w") as f:
                json.dump(config, f)
            return config, WINDOWS_PATH


def configure_peakcaller(raw_data_folder, mass_transformation, mass_cutoff, peak_width_cutoff,
                         peak_distance_cutoff, config, command):
    """Update the stored config: arguments that are not None override the current values."""
    new_config = {k: v for k, v in locals().items() if k not in ("config", "command") and v is not None}
    old_config = config
    merged_config = {k: new_config[k] if k in new_config else old_config[k] for k in old_config}
    if platform.system() == "Linux":
        with open(LINUX_PATH, "w") as f:
            json.dump(merged_config, f)
    elif platform.system() == "Windows":
        with open(WINDOWS_PATH, "w") as f:
            json.dump(merged_config, f)
    return merged_config
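The merge rule used by configure_peakcaller (newly supplied non-None values override stored ones) can be illustrated in isolation; the dictionary contents here are made-up examples:

```python
# Stored config on disk:
old_config = {"mass_cutoff": 20, "peak_width_cutoff": 5}
# Arguments the user actually supplied (None means "not given"):
supplied = {"mass_cutoff": 30, "peak_width_cutoff": None}

# Keep only supplied values, then let them override the stored ones:
new_config = {k: v for k, v in supplied.items() if v is not None}
merged_config = {k: new_config[k] if k in new_config else old_config[k] for k in old_config}
print(merged_config)  # {'mass_cutoff': 30, 'peak_width_cutoff': 5}
```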
{"mass_transformation": 0.00574, "mass_cutoff": 20.0, "peak_width_cutoff": 5.0, "peak_distance_cutoff": 5.0, "raw_data_folder": "/home/heller/pyProjects/gooey_lifescale/LSdata/raw_data"}
""" GUI application for processing LifeScale data.
copyright 2019 Joseph Elsherbini
all rights reserved
"""
import os
import struct
import json
import re
import datetime
from itertools import chain
from operator import itemgetter
import numpy as np
import pandas as pd
import scipy.signal
NOW = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
def list_experiments(config):
    raw_data_files = [f for f in os.listdir(os.path.expanduser(config["raw_data_folder"]))
                      if re.search(r"(.+)_(\d{6})_(\d{6})", f) and os.path.splitext(f)[1] == ""]
    unique_experiments = sorted(sorted(list(set([re.search(r"(.+)_(\d{6})_(\d{6})", f).groups() for f in raw_data_files])),
                                       key=itemgetter(2), reverse=True), key=itemgetter(1), reverse=True)
    return (["{} {}".format(e[0], get_date_time(e[1], e[2])) for e in unique_experiments],
            ["_".join(e) for e in unique_experiments])


def get_date_time(date, time):
    fmt_string = "%m/%d/%Y %H:%M:%S"
    return datetime.datetime(2000 + int(date[0:2]), int(date[2:4]), int(date[4:6]),
                             int(time[0:2]), int(time[2:4]), int(time[4:6])).strftime(fmt_string)
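The filename timestamps are YYMMDD/HHMMSS fields; the parsing in get_date_time can be checked directly (the sample timestamp below is invented):

```python
import datetime

# Mirror of the parsing done in get_date_time() above:
date, time = "220301", "143000"
parsed = datetime.datetime(2000 + int(date[0:2]), int(date[2:4]), int(date[4:6]),
                           int(time[0:2]), int(time[2:4]), int(time[4:6]))
print(parsed.strftime("%m/%d/%Y %H:%M:%S"))  # 03/01/2022 14:30:00
```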
def call_peaks(experiment, output_folder, metadata_file, config, command):
    update_now()
    all_experiments = list_experiments(config)
    exp_name = [e[1] for e in zip(all_experiments[0], all_experiments[1]) if e[0] == experiment][0]
    exp_files = [os.path.join(os.path.expanduser(config["raw_data_folder"]), f)
                 for f in os.listdir(os.path.expanduser(config["raw_data_folder"]))
                 if exp_name in f and os.path.splitext(f)[1] == ""]
    print(exp_name, exp_files)
    peaks = write_peaks(exp_name, exp_files, output_folder, metadata_file, config)
    write_summary(exp_name, peaks, output_folder)
    # TODO write_plots(exp_name, peaks, output_folder, config)
    write_config(exp_name, output_folder, config)
    return config


def update_now():
    """Refresh the global timestamp used to prefix output filenames."""
    global NOW
    NOW = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
def parse_metadata(metadata_file):
    return pd.read_csv(metadata_file)[["Id", "Well"]]


def load_raw_data(exp_name, exp_files):
    for f_path in exp_files:
        m = re.search(r"(.+)_(\d{6})_(\d{6})_c(\d+)_v(\d+)", f_path)
        exp_date, exp_time, exp_cycle, exp_measurement = m.group(2, 3, 4, 5)
        print(exp_name, exp_date, exp_time, exp_cycle, exp_measurement)
        n_datapoints = int(os.path.getsize(f_path) / 8)  # 8 bytes per double
        with open(f_path, "rb") as f:
            content = f.read()
        a = np.array(struct.unpack("d" * n_datapoints, content))[10:]  # discard the first 10 datapoints
        yield dict(zip(["exp_name", "exp_date", "exp_time", "exp_cycle", "exp_measurement", "data_array"],
                       [exp_name, exp_date, exp_time, exp_cycle, exp_measurement, a]))
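The binary layout assumed by load_raw_data is a flat array of native-endian C doubles with the first 10 values discarded; this can be mimicked in memory (the sample values are arbitrary):

```python
import struct
import numpy as np

values = [float(i) for i in range(14)]
content = struct.pack("d" * len(values), *values)  # pack as C doubles, 8 bytes each
n_datapoints = len(content) // 8
a = np.array(struct.unpack("d" * n_datapoints, content))[10:]  # drop the first 10 datapoints
print(a)  # [10. 11. 12. 13.]
```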
def generate_peaks(measurement, config):
    filtered_signal = scipy.signal.savgol_filter(measurement["data_array"], window_length=5, polyorder=3)
    peaks, _ = scipy.signal.find_peaks(-filtered_signal, width=config["peak_width_cutoff"],
                                       prominence=config["mass_cutoff"] * config["mass_transformation"],
                                       distance=config["peak_distance_cutoff"])
    masses = scipy.signal.peak_prominences(-filtered_signal, peaks)[0] * (1 / config["mass_transformation"])
    for peak, mass in zip(peaks, masses):
        yield dict(zip(["exp_name", "exp_date", "exp_time", "exp_cycle", "exp_measurement", "event_index", "event_mass"],
                       [measurement["exp_name"], measurement["exp_date"], measurement["exp_time"],
                        measurement["exp_cycle"], measurement["exp_measurement"], peak, mass]))
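The peak-calling step (events appear as downward excursions, so peaks are found on the inverted trace) can be demonstrated on a synthetic signal; the dip positions and cutoff values here are made up:

```python
import numpy as np
import scipy.signal

# Flat baseline with two downward excursions, as a passing cell would produce:
signal = np.zeros(100)
signal[30:35] = -5.0
signal[70:75] = -5.0

# Find events as peaks on the inverted trace, mirroring generate_peaks() above:
peaks, _ = scipy.signal.find_peaks(-signal, width=2, prominence=1.0, distance=5)
prominences = scipy.signal.peak_prominences(-signal, peaks)[0]
print(len(peaks))
```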
def write_peaks(exp_name, exp_files, output_folder, metadata_file, config):
    peaks = pd.DataFrame(chain.from_iterable([generate_peaks(measurement, config)
                                              for measurement in load_raw_data(exp_name, exp_files)]))
    if metadata_file:
        metadata = parse_metadata(metadata_file)
        peaks = peaks.astype({'exp_measurement': 'int32'}).merge(metadata.astype({'Id': 'int32'}),
                                                                 how='left', left_on='exp_measurement', right_on='Id')
        peaks["Well"] = ["".join([w[0], w[1:].zfill(2)]) for w in peaks["Well"]]
    out_path = os.path.join(os.path.expanduser(output_folder), "{}_{}_peaks.csv".format(NOW, exp_name))
    peaks.to_csv(out_path, index=False)
    return peaks
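The zero-padding applied to well labels above makes them sort lexicographically in the output CSV; for example:

```python
# Pad the numeric part of each well label to two digits ("A1" -> "A01"),
# exactly as in write_peaks() above:
wells = ["A1", "B12", "C7"]
padded = ["".join([w[0], w[1:].zfill(2)]) for w in wells]
print(padded)  # ['A01', 'B12', 'C07']
```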
def write_summary(exp_name, peaks, output_folder):
    print(peaks.columns)
    if "Well" in peaks.columns:
        summary = peaks.groupby(["Well", "exp_cycle"])["event_mass"].describe()
    else:
        summary = peaks.groupby(["exp_measurement", "exp_cycle"])["event_mass"].describe()
    out_path = os.path.join(os.path.expanduser(output_folder), "{}_{}_summary.csv".format(NOW, exp_name))
    summary.to_csv(out_path)


def write_config(exp_name, output_folder, config):
    output_path = os.path.join(os.path.expanduser(output_folder), "{}_{}_config.json".format(NOW, exp_name))
    with open(output_path, "w") as f:
        json.dump(config, f)
import os
import re
from datetime import datetime
from functools import partial
from operator import itemgetter
from gooey import Gooey, GooeyParser
import naive_peaks
import configure_peakcaller

DISPATCHER = {
    "call_peaks": naive_peaks.call_peaks,
    "config": configure_peakcaller.configure_peakcaller
}


def show_error_modal(error_msg):
    """Spawn a modal dialog showing error_msg."""
    # wx imported locally so as not to interfere with Gooey
    import wx
    app = wx.App()
    dlg = wx.MessageDialog(None, error_msg, 'Error', wx.ICON_ERROR)
    dlg.ShowModal()
    dlg.Destroy()
def add_call_peak_gui(subs, config):
    p = subs.add_parser('call_peaks', prog='Call Mass Peaks', help='Get Mass Peaks from Raw Lifescale Data')
    p.add_argument(
        'experiment',
        metavar='Choose an Experiment',
        help='Choose the name of an experiment',
        widget='Dropdown',
        choices=naive_peaks.list_experiments(config)[0])
    p.add_argument('output_folder', widget="DirChooser")
    p.add_argument('--metadata_file', '-f', widget="FileChooser",
                   help="If provided, convert vial ids to sample names. Should be the exported csv file "
                        "called PanelData.csv.")


def add_config_gui(subs, config):
    p = subs.add_parser('config', prog="Configure Program",
                        help="Options to change where this program looks for data, and the calibration used "
                             "for frequency to mass conversion.")
    p.add_argument('--raw_data_folder', widget="DirChooser", help="currently {}".format(config["raw_data_folder"]))
    p.add_argument('--mass_transformation', type=float,
                   help='currently {} Hz/fg'.format(config["mass_transformation"]))
    p.add_argument('--mass_cutoff', '-m', type=float, default=20,
                   help='currently {} fg - minimum mass of the peak (minimum 5 fg recommended)'.format(config["mass_cutoff"]))
    p.add_argument('--peak_width_cutoff', '-w', type=float, default=5,
                   help='currently {} - width cutoff for peaks - minimum datapoints looking larger than noise'.format(config["peak_width_cutoff"]))
    p.add_argument('--peak_distance_cutoff', '-d', type=float, default=5,
                   help='currently {} - distance cutoff for peaks - minimum datapoints between peaks'.format(config["peak_distance_cutoff"]))
@Gooey(program_name='Mass Peak Caller', image_dir='./images', required_cols=1)
def main():
    current_config, file_not_found = configure_peakcaller.load_config()
    if file_not_found:
        show_error_modal("No configuration file found at {}.\nWrote default configuration to that location.\n"
                         "Continuing with default config.".format(file_not_found))
    parser = GooeyParser(description='Get Mass Peaks from Raw Lifescale Data')
    subs = parser.add_subparsers(help='commands', dest='command')
    add_call_peak_gui(subs, current_config)
    add_config_gui(subs, current_config)
    args = parser.parse_args()
    opts = vars(args)
    func = partial(DISPATCHER[args.command], config=current_config)
    current_config = func(**opts)


if __name__ == '__main__':
    main()
...@@ -206,7 +206,7 @@ class LSData:
            if item_not_nan_max_idx is np.nan:  # No items that are not NaN!
                settings_dict[row[0]] = None
            else:
                tmp_list = short_row.loc[:item_not_nan_max_idx].to_list()
                num_items = len(tmp_list)
                if num_items == 1:
                    settings_dict[row[0]] = tmp_list[0]
...@@ -399,7 +399,9 @@ class LSData:
    def export_csv_files(self, output_filepath, verbose=True, sort_by_time=False):
        """Write CSV files to output directory."""
        if verbose:
            print('Write output')
        # Checks:
        if not os.path.exists(output_filepath):
            raise AssertionError(f'The output path does not exist: {output_filepath}')
...@@ -468,6 +470,7 @@ class LSData:
        else:
            return 'No data available yet.'


def remove_space_from_column_names(df):
    """Removes white space from column names of input dataframe."""
    col_names = df.columns
...@@ -477,13 +480,14 @@ def remove_space_from_column_names(df):
    df.columns = col_names_corrected
    return df


def row_to_list(row) -> list:
    """Convert dataframe row to list and remove all trailing NaN values."""
    item_not_nan_max_idx = row.loc[~row.isna()].index.max()
    if item_not_nan_max_idx is np.nan:  # No items that are not NaN!
        out_list = []
    else:
        out_list = row.loc[:item_not_nan_max_idx].to_list()
    return out_list
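row_to_list drops trailing NaNs but keeps interior ones, because the `.loc` slice up to the last non-NaN index is endpoint-inclusive; a standalone check of that behavior:

```python
import numpy as np
import pandas as pd

row = pd.Series([1.0, np.nan, 3.0, np.nan, np.nan])
last_valid_idx = row.loc[~row.isna()].index.max()  # index of the last non-NaN item (here: 2)
out_list = row.loc[:last_valid_idx].to_list()      # .loc slices include the endpoint
print(out_list)
```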
......
"""Conversion program from xlsm to csv.

Copyright (C) 2022 Andreas Hellerschmied <heller182@gmx.at>
"""
from lifescale.models.ls_data import LSData


def ls2csv(xlsm_filename, oputput_dir, verbose=True):
    """Convert lifescale output file (xlsm) to csv files."""
    ls_data = LSData.from_xlsm_file(input_xlsm_filename=xlsm_filename, verbose=verbose)
    ls_data.export_csv_files(oputput_dir, verbose=verbose)
"""Start the lifescale GUI from here!

Copyright (C) 2022 Andreas Hellerschmied <heller182@gmx.at>
"""
from lifescale.gui.gui_main import main
......
...@@ -12,3 +12,19 @@ init:
# Convert *.ui files from Qt Designer to Python files:
py_gui:
	pyuic6 -o lifescale/gui/MainWindow.py lifescale/gui/MainWindow.ui
# Package test (install in current virtual environment, editable install with pip)
test_pack:
	pip install -e .

# Uninstall test package
test_pack_uninstall:
	pip uninstall lifescale-utils

# Build package with setuptools (new version):
build:
	python -m build

# Upload package to pypi.org
pypi_push:
	twine upload --verbose dist/*
[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"
[metadata]
name = lifescale-utils
version = attr: lifescale.__version__
author = Andreas Hellerschmied
author_email = heller182@gmx.at
url = https://gitlab.com/hellerdev/lifescale_utils
description = Lifescale utility software.
long_description = file: README.md
long_description_content_type = text/markdown
keywords = Lifescale
license = GNU GPLv3
classifiers =
    License :: OSI Approved :: GNU General Public License (GPL)
    Programming Language :: Python :: 3

[options]
python_requires = >=3.6, <4
packages = find:
zip_safe = True
include_package_data = True
install_requires =
    numpy
    pandas
    openpyxl

[options.entry_points]
console_scripts =
    ls2csv = lifescale.command_line.command_line:ls2csv