Commit 2cbc9405 authored by Andreas Hellerschmied's avatar Andreas Hellerschmied

First distribution v0.0.2

parent 63c2ecaa
# lifescale_utils
Data analysis tools for lifescale with GUI.

# Installation and setup
* **1. Create virtual environment (venv)**
  * `python3 -m venv env`
* **2. Activate virtual environment**
  * `source env/bin/activate`
* **3. Clone git repository to local machine**
  * `git clone git@gitlab.com:hellerdev/lifescale_gui.git`
  * `cd lifescale_gui`
* **4. Install required python packages using pip**
  * `pip install -r requirements.txt`

# Command line programs:
## ls2csv
The program *ls2csv* reads the content of the xlsm files written by lifescale units, parses the data and writes it to three csv files:
* Masses_Vibrio_[run-name].csv: Contains the data series from the sheet AcquisitionIntervals.
* SampleMetadata_[run-name].csv: Contains the data from the sheet PanelData.
* Summary_[run-name].csv: Contains the data from the sheet IntervalAnalysis.

### Usage:
* Conversion: `ls2csv -i [path and name of xlsm file] -o [output directory]`
* Help: `ls2csv -h`

## Installation issues on Ubuntu (20.04):
After installing PyQt5 with pip3, the following error occurred when trying to run a PyQt GUI: qt.qpa.plugin: Could not load the Qt platform plugin "xcb" in "" even though it was found.
This issue was resolved by installing the Qt dev tools (Designer, etc.):
* `sudo apt-get install qttools5-dev-tools`
# Test installation with setuptools
With command line interface.
* **1. Configure setup.py**
  * Define entry points (*console_scripts*)
* **2. Activate virtual environment**
  * e.g. `source env/bin/activate`
* **3. Run setup.py**
  * `python3 setup.py develop`

## Using make
`python3 setup.py develop`

# Run application on Windows and create a stand-alone Windows executable file:
TODO

# Comments on requirements.txt file:
* Two entries can be deleted:
  * -e git+git@gitlab.com:Heller182/grav.git@fe528c0769502e84a06be67a742032cacfd386df#egg=gravtools
  * pkg-resources==0.0.0 (created due to a bug when using Linux, see: https://stackoverflow.com/questions/39577984/what-is-pkg-resources-0-0-0-in-output-of-pip-freeze-command)

# Create HTML documentation with sphinx:
Run make in the gravtools/doc directory:
* `make html_doc`

# License and copyright
Copyright (C) 2022 Andreas Hellerschmied (<heller182@gmx.at>)

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License
along with this program. If not, see <https://www.gnu.org/licenses/>.
# Guidelines and conventions
## Code style:
* Respect the PEP conventions on python coding!
* PEP 8 -- Style Guide for Python Code: https://www.python.org/dev/peps/pep-0008/
* The maximum line length is 120 characters
* Use **type hints**: https://www.python.org/dev/peps/pep-0484/
* Use docstrings according to the numpy standard: https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard
* They are useful to generate the documentation automatically
* Example: https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html
* Comment code, if necessary!
* Use English language for the code, docstrings and comments
* German is allowed for user interfaces (GUI, command line), although English is preferred
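A minimal function following these conventions, with type hints and a numpy-style docstring (the function itself is an invented illustration, not part of the package):

```python
def mass_from_prominence(prominence: float, mass_transformation: float) -> float:
    """Convert a peak prominence to a mass estimate.

    Parameters
    ----------
    prominence : float
        Peak prominence in Hz.
    mass_transformation : float
        Calibration factor in Hz/fg.

    Returns
    -------
    float
        Estimated mass in fg.
    """
    return prominence / mass_transformation
```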
## Documentation and docstring style
* The API reference is created with sphinx (https://www.sphinx-doc.org/).
* Docstrings have to follow the numpy standard, see: https://numpydoc.readthedocs.io/en/latest/format.html#docstring-standard
* Examples: https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_numpy.html
* Package documentation via docstring in __init__.py files
* Module documentation via docstring at first lines of py-file
* Documentation of classes, class methods and functions via docstrings
## Command line interface and executable scripts
* The command line interface is realized via entry points (console_scripts) in setuptools (python packaging tool)
* Input arguments are handled with argparse
* The code is located in the command_line module (gravtools/command_line.py)
* Executable scripts are located in gravtools/scripts
## Dependencies
* Required python packages are listed in requirements.txt
  * created with `pip freeze > requirements.txt`
## Version control with GIT
* Gitlab repository: https://gitlab.com/Heller182/grav
* Branching model:
* **master** branch: Current release version
* **develop** branch: Current working version.
* All team members merge their feature branches into develop (merge request via gitlab)
* Make sure that the develop branch contains a fully functional version of the code!
* **feature** branches: Branches of develop for the implementation of new features and other changes.
* Code changes only in feature branches!
* Naming convention: feature_<description of change/feature>, e.g. feature_new_tide_model
* Use gitignore files to prevent any data files (except example files), IDE control files, compiled python code, etc. from being stored in the GIT repository
* General rule: Ignore everything in a directory and define explicit exceptions!
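Following this rule, a data directory's .gitignore could look like the sketch below (the whitelisted file name is a hypothetical example):

```gitignore
# Ignore everything in this directory ...
*
# ... except the .gitignore itself and explicitly whitelisted example files:
!.gitignore
!example_run.xlsm
```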
## Packaging and distribution
* With setuptools
"""LifeScale utils is a utility program for handling data output.

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
...
along with this program. If not, see <https://www.gnu.org/licenses/>.

Andreas Hellerschmied (heller182@gmx.at)
"""
__version__ = '0.0.2'
__author__ = 'Andreas Hellerschmied'
__git_repo__ = 'tba'
__email__ = 'heller182@gmx.at'
......
"""Command line interface of lifescale utils.

Copyright (C) 2022 Andreas Hellerschmied <heller182@gmx.at>
"""
from lifescale.scripts.ls2csv import ls2csv as ls2csv_main
import argparse
import os


def is_file(filename):
    """Check whether the input string is the path of an existing file."""
    if os.path.isfile(filename):
        return filename
    raise argparse.ArgumentTypeError("'{}' is not a valid file.".format(filename))


def is_dir(pathname):
    """Check whether the input string is the path of an existing directory."""
    if os.path.exists(pathname):
        return pathname
    raise argparse.ArgumentTypeError("'{}' is not a valid directory.".format(pathname))


def ls2csv():
    """Command line interface including argument parser for the lifescale2csv converter."""
    parser = argparse.ArgumentParser(prog="ls2csv",
                                     description="Conversion from lifescale xlsm output to csv files",
                                     epilog="The ls2csv converter loads and parses xlsm files created by the lifescale "
                                            "unit. It writes several csv files to the output directory that contain "
                                            "extracted data from the input xlsm file in an easily readable way.",
                                     formatter_class=argparse.ArgumentDefaultsHelpFormatter)
    parser.add_argument("-i", "--input-xlsm", type=is_file, required=True,
                        help="Path and name of the input xlsm file created by lifescale.")
    parser.add_argument("-o", "--out-dir", type=is_dir, required=True, help="Output directory for the CSV files.")
    parser.add_argument("-nv", "--not-verbose", required=False, help="Disable command line status messages.",
                        action='store_true')
    # parser.add_argument("--out-dir", type=is_dir, required=False,
    #                     help="path to output directory", default=OUT_PATH)
    args = parser.parse_args()
    verbose = not args.not_verbose
    return ls2csv_main(xlsm_filename=args.input_xlsm, oputput_dir=args.out_dir, verbose=verbose)


if __name__ == '__main__':
    ls2csv()
import os
import platform
import json

DEFAULT_CONFIG = {
    "mass_transformation": 0.00574,
    "mass_cutoff": 20,
    "peak_width_cutoff": 5,
    "peak_distance_cutoff": 5,
    "raw_data_folder": "~/research/lifescale_raw_data_test/development_raw_data_folder"
}

LINUX_PATH = "./dev_config.json"
WINDOWS_PATH = r"C:\Users\LifeScale\Documents\peak_caller_config\peak_caller_config.json"


def load_config():
    """Load the platform-specific config file; if missing, write and return the default config."""
    if platform.system() == "Linux":
        try:
            with open(LINUX_PATH, "r") as f:
                config = json.load(f)
            return config, None
        except FileNotFoundError:
            config = DEFAULT_CONFIG
            with open(LINUX_PATH, "w") as f:
                json.dump(config, f)
            return config, LINUX_PATH
    elif platform.system() == "Windows":
        try:
            with open(WINDOWS_PATH, "r") as f:
                config = json.load(f)
            return config, None
        except FileNotFoundError:
            config = DEFAULT_CONFIG
            with open(WINDOWS_PATH, "w") as f:
                json.dump(config, f)
            return config, WINDOWS_PATH


def configure_peakcaller(raw_data_folder, mass_transformation, mass_cutoff, peak_width_cutoff,
                         peak_distance_cutoff, config, command):
    """Update the stored config: arguments that are not None override the current values."""
    new_config = {k: v for k, v in locals().items() if k not in ("config", "command") and v is not None}
    old_config = config
    merged_config = {k: new_config[k] if k in new_config else old_config[k] for k in old_config}
    if platform.system() == "Linux":
        with open(LINUX_PATH, "w") as f:
            json.dump(merged_config, f)
    elif platform.system() == "Windows":
        with open(WINDOWS_PATH, "w") as f:
            json.dump(merged_config, f)
    return merged_config
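The merge rule used by configure_peakcaller (newly supplied non-None values override stored ones) can be illustrated in isolation; the dictionary contents here are made-up examples:

```python
# Stored config on disk:
old_config = {"mass_cutoff": 20, "peak_width_cutoff": 5}
# Arguments the user actually supplied (None means "not given"):
supplied = {"mass_cutoff": 30, "peak_width_cutoff": None}

# Keep only supplied values, then let them override the stored ones:
new_config = {k: v for k, v in supplied.items() if v is not None}
merged_config = {k: new_config[k] if k in new_config else old_config[k] for k in old_config}
print(merged_config)  # {'mass_cutoff': 30, 'peak_width_cutoff': 5}
```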
{"mass_transformation": 0.00574, "mass_cutoff": 20.0, "peak_width_cutoff": 5.0, "peak_distance_cutoff": 5.0, "raw_data_folder": "/home/heller/pyProjects/gooey_lifescale/LSdata/raw_data"}
""" GUI application for processing LifeScale data.
copyright 2019 Joseph Elsherbini
all rights reserved
"""
import os
import struct
import json
import re
import datetime
from itertools import chain
from operator import itemgetter
import numpy as np
import pandas as pd
import scipy.signal
NOW = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
def list_experiments(config):
    raw_data_files = [f for f in os.listdir(os.path.expanduser(config["raw_data_folder"]))
                      if re.search(r"(.+)_(\d{6})_(\d{6})", f) and os.path.splitext(f)[1] == ""]
    unique_experiments = sorted(sorted(list(set([re.search(r"(.+)_(\d{6})_(\d{6})", f).groups() for f in raw_data_files])),
                                       key=itemgetter(2), reverse=True), key=itemgetter(1), reverse=True)
    return (["{} {}".format(e[0], get_date_time(e[1], e[2])) for e in unique_experiments],
            ["_".join(e) for e in unique_experiments])


def get_date_time(date, time):
    fmt_string = "%m/%d/%Y %H:%M:%S"
    return datetime.datetime(2000 + int(date[0:2]), int(date[2:4]), int(date[4:6]),
                             int(time[0:2]), int(time[2:4]), int(time[4:6])).strftime(fmt_string)
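The filename timestamps are YYMMDD/HHMMSS fields; the parsing in get_date_time can be checked directly (the sample timestamp below is invented):

```python
import datetime

# Mirror of the parsing done in get_date_time() above:
date, time = "220301", "143000"
parsed = datetime.datetime(2000 + int(date[0:2]), int(date[2:4]), int(date[4:6]),
                           int(time[0:2]), int(time[2:4]), int(time[4:6]))
print(parsed.strftime("%m/%d/%Y %H:%M:%S"))  # 03/01/2022 14:30:00
```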
def call_peaks(experiment, output_folder, metadata_file, config, command):
    update_now()
    all_experiments = list_experiments(config)
    exp_name = [e[1] for e in zip(all_experiments[0], all_experiments[1]) if e[0] == experiment][0]
    exp_files = [os.path.join(os.path.expanduser(config["raw_data_folder"]), f)
                 for f in os.listdir(os.path.expanduser(config["raw_data_folder"]))
                 if exp_name in f and os.path.splitext(f)[1] == ""]
    print(exp_name, exp_files)
    peaks = write_peaks(exp_name, exp_files, output_folder, metadata_file, config)
    write_summary(exp_name, peaks, output_folder)
    # TODO write_plots(exp_name, peaks, output_folder, config)
    write_config(exp_name, output_folder, config)
    return config


def update_now():
    """Refresh the global timestamp used to prefix output filenames."""
    global NOW
    NOW = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
def parse_metadata(metadata_file):
    return pd.read_csv(metadata_file)[["Id", "Well"]]


def load_raw_data(exp_name, exp_files):
    for f_path in exp_files:
        m = re.search(r"(.+)_(\d{6})_(\d{6})_c(\d+)_v(\d+)", f_path)
        exp_date, exp_time, exp_cycle, exp_measurement = m.group(2, 3, 4, 5)
        print(exp_name, exp_date, exp_time, exp_cycle, exp_measurement)
        n_datapoints = int(os.path.getsize(f_path) / 8)  # 8 bytes per double
        with open(f_path, "rb") as f:
            content = f.read()
        a = np.array(struct.unpack("d" * n_datapoints, content))[10:]  # discard the first 10 datapoints
        yield dict(zip(["exp_name", "exp_date", "exp_time", "exp_cycle", "exp_measurement", "data_array"],
                       [exp_name, exp_date, exp_time, exp_cycle, exp_measurement, a]))
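The binary layout assumed by load_raw_data is a flat array of native-endian C doubles with the first 10 values discarded; this can be mimicked in memory (the sample values are arbitrary):

```python
import struct
import numpy as np

values = [float(i) for i in range(14)]
content = struct.pack("d" * len(values), *values)  # pack as C doubles, 8 bytes each
n_datapoints = len(content) // 8
a = np.array(struct.unpack("d" * n_datapoints, content))[10:]  # drop the first 10 datapoints
print(a)  # [10. 11. 12. 13.]
```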
def generate_peaks(measurement, config):
    filtered_signal = scipy.signal.savgol_filter(measurement["data_array"], window_length=5, polyorder=3)
    peaks, _ = scipy.signal.find_peaks(-filtered_signal, width=config["peak_width_cutoff"],
                                       prominence=config["mass_cutoff"] * config["mass_transformation"],
                                       distance=config["peak_distance_cutoff"])
    masses = scipy.signal.peak_prominences(-filtered_signal, peaks)[0] * (1 / config["mass_transformation"])
    for peak, mass in zip(peaks, masses):
        yield dict(zip(["exp_name", "exp_date", "exp_time", "exp_cycle", "exp_measurement", "event_index", "event_mass"],
                       [measurement["exp_name"], measurement["exp_date"], measurement["exp_time"],
                        measurement["exp_cycle"], measurement["exp_measurement"], peak, mass]))
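The peak-calling step (events appear as downward excursions, so peaks are found on the inverted trace) can be demonstrated on a synthetic signal; the dip positions and cutoff values here are made up:

```python
import numpy as np
import scipy.signal

# Flat baseline with two downward excursions, as a passing cell would produce:
signal = np.zeros(100)
signal[30:35] = -5.0
signal[70:75] = -5.0

# Find events as peaks on the inverted trace, mirroring generate_peaks() above:
peaks, _ = scipy.signal.find_peaks(-signal, width=2, prominence=1.0, distance=5)
prominences = scipy.signal.peak_prominences(-signal, peaks)[0]
print(len(peaks))
```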
def write_peaks(exp_name, exp_files, output_folder, metadata_file, config):
    peaks = pd.DataFrame(chain.from_iterable([generate_peaks(measurement, config)
                                              for measurement in load_raw_data(exp_name, exp_files)]))
    if metadata_file:
        metadata = parse_metadata(metadata_file)
        peaks = peaks.astype({'exp_measurement': 'int32'}).merge(metadata.astype({'Id': 'int32'}),
                                                                 how='left', left_on='exp_measurement', right_on='Id')
        peaks["Well"] = ["".join([w[0], w[1:].zfill(2)]) for w in peaks["Well"]]
    out_path = os.path.join(os.path.expanduser(output_folder), "{}_{}_peaks.csv".format(NOW, exp_name))
    peaks.to_csv(out_path, index=False)
    return peaks
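The zero-padding applied to well labels above makes them sort lexicographically in the output CSV; for example:

```python
# Pad the numeric part of each well label to two digits ("A1" -> "A01"),
# exactly as in write_peaks() above:
wells = ["A1", "B12", "C7"]
padded = ["".join([w[0], w[1:].zfill(2)]) for w in wells]
print(padded)  # ['A01', 'B12', 'C07']
```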
def write_summary(exp_name, peaks, output_folder):
    print(peaks.columns)
    if "Well" in peaks.columns:
        summary = peaks.groupby(["Well", "exp_cycle"])["event_mass"].describe()
    else:
        summary = peaks.groupby(["exp_measurement", "exp_cycle"])["event_mass"].describe()
    out_path = os.path.join(os.path.expanduser(output_folder), "{}_{}_summary.csv".format(NOW, exp_name))
    summary.to_csv(out_path)


def write_config(exp_name, output_folder, config):
    output_path = os.path.join(os.path.expanduser(output_folder), "{}_{}_config.json".format(NOW, exp_name))
    with open(output_path, "w") as f:
        json.dump(config, f)
import os
import re
from datetime import datetime
from functools import partial
from operator import itemgetter
from gooey import Gooey, GooeyParser
import naive_peaks
import configure_peakcaller

DISPATCHER = {
    "call_peaks": naive_peaks.call_peaks,
    "config": configure_peakcaller.configure_peakcaller
}


def show_error_modal(error_msg):
    """Spawn a modal dialog showing error_msg."""
    # wx imported locally so as not to interfere with Gooey
    import wx
    app = wx.App()
    dlg = wx.MessageDialog(None, error_msg, 'Error', wx.ICON_ERROR)
    dlg.ShowModal()
    dlg.Destroy()
def add_call_peak_gui(subs, config):
    p = subs.add_parser('call_peaks', prog='Call Mass Peaks', help='Get Mass Peaks from Raw Lifescale Data')
    p.add_argument(
        'experiment',
        metavar='Choose an Experiment',
        help='Choose the name of an experiment',
        widget='Dropdown',
        choices=naive_peaks.list_experiments(config)[0])
    p.add_argument('output_folder', widget="DirChooser")
    p.add_argument('--metadata_file', '-f', widget="FileChooser",
                   help="If provided, convert vial ids to sample names. Should be the exported csv file "
                        "called PanelData.csv.")


def add_config_gui(subs, config):
    p = subs.add_parser('config', prog="Configure Program",
                        help="Options to change where this program looks for data, and the calibration used "
                             "for frequency to mass conversion.")
    p.add_argument('--raw_data_folder', widget="DirChooser", help="currently {}".format(config["raw_data_folder"]))
    p.add_argument('--mass_transformation', type=float,
                   help='currently {} Hz/fg'.format(config["mass_transformation"]))
    p.add_argument('--mass_cutoff', '-m', type=float, default=20,
                   help='currently {} fg - minimum mass of the peak (minimum 5 fg recommended)'.format(config["mass_cutoff"]))
    p.add_argument('--peak_width_cutoff', '-w', type=float, default=5,
                   help='currently {} - width cutoff for peaks - minimum datapoints looking larger than noise'.format(config["peak_width_cutoff"]))
    p.add_argument('--peak_distance_cutoff', '-d', type=float, default=5,
                   help='currently {} - distance cutoff for peaks - minimum datapoints between peaks'.format(config["peak_distance_cutoff"]))
@Gooey(program_name='Mass Peak Caller', image_dir='./images', required_cols=1)
def main():
    current_config, file_not_found = configure_peakcaller.load_config()
    if file_not_found:
        show_error_modal("No configuration file found at {}.\nWrote default configuration to that location.\n"
                         "Continuing with default config.".format(file_not_found))
    parser = GooeyParser(description='Get Mass Peaks from Raw Lifescale Data')
    subs = parser.add_subparsers(help='commands', dest='command')
    add_call_peak_gui(subs, current_config)
    add_config_gui(subs, current_config)
    args = parser.parse_args()
    opts = vars(args)
    func = partial(DISPATCHER[args.command], config=current_config)
    current_config = func(**opts)


if __name__ == '__main__':
    main()
...@@ -206,7 +206,7 @@ class LSData:
            if item_not_nan_max_idx is np.nan:  # No items that are not NaN!
                settings_dict[row[0]] = None
            else:
                tmp_list = short_row.loc[:item_not_nan_max_idx].to_list()
                num_items = len(tmp_list)
                if num_items == 1:
                    settings_dict[row[0]] = tmp_list[0]
...@@ -399,7 +399,9 @@ class LSData:
    def export_csv_files(self, output_filepath, verbose=True, sort_by_time=False):
        """Write CSV files to output directory."""
        if verbose:
            print('Write output')
        # Checks:
        if not os.path.exists(output_filepath):
            raise AssertionError(f'The output path does not exist: {output_filepath}')
...@@ -468,6 +470,7 @@ class LSData:
        else:
            return 'No data available yet.'


def remove_space_from_column_names(df):
    """Removes white space from column names of input dataframe."""
    col_names = df.columns
...@@ -477,13 +480,14 @@ def remove_space_from_column_names(df):
    df.columns = col_names_corrected
    return df


def row_to_list(row) -> list:
    """Convert dataframe row to list and remove all trailing NaN values."""
    item_not_nan_max_idx = row.loc[~row.isna()].index.max()
    if item_not_nan_max_idx is np.nan:  # No items that are not NaN!
        out_list = []
    else:
        out_list = row.loc[:item_not_nan_max_idx].to_list()
    return out_list
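row_to_list drops trailing NaNs but keeps interior ones, because the `.loc` slice up to the last non-NaN index is endpoint-inclusive; a standalone check of that behavior:

```python
import numpy as np
import pandas as pd

row = pd.Series([1.0, np.nan, 3.0, np.nan, np.nan])
last_valid_idx = row.loc[~row.isna()].index.max()  # index of the last non-NaN item (here: 2)
out_list = row.loc[:last_valid_idx].to_list()      # .loc slices include the endpoint
print(out_list)
```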
......
"""Conversion program from xlsm to csv.

Copyright (C) 2022 Andreas Hellerschmied <heller182@gmx.at>
"""
from lifescale.models.ls_data import LSData


def ls2csv(xlsm_filename, oputput_dir, verbose=True):
    """Convert lifescale output file (xlsm) to csv files."""
    ls_data = LSData.from_xlsm_file(input_xlsm_filename=xlsm_filename, verbose=verbose)
    ls_data.export_csv_files(oputput_dir, verbose=verbose)
"""Start the lifescale GUI from here!

Copyright (C) 2022 Andreas Hellerschmied <heller182@gmx.at>
"""
from lifescale.gui.gui_main import main
......
...@@ -12,3 +12,19 @@ init:
# Convert *.ui files from Qt Designer to Python files:
py_gui:
	pyuic6 -o lifescale/gui/MainWindow.py lifescale/gui/MainWindow.ui
# Package test (install in current virtual environment, editable install with pip)
test_pack:
	pip install -e .

# Uninstall test package
test_pack_uninstall:
	pip uninstall lifescale-utils

# Build package with setuptools (new version):
build:
	python -m build

# Upload package to pypi.org
pypi_push:
	twine upload --verbose dist/*
[build-system]
requires = ["setuptools", "wheel"]
build-backend = "setuptools.build_meta"
[metadata]
name = lifescale-utils
version = attr: lifescale.__version__
author = Andreas Hellerschmied
author_email = heller182@gmx.at
url = https://gitlab.com/hellerdev/lifescale_utils
description = Lifescale utility software.
long_description = file: README.md
long_description_content_type = text/markdown
keywords = Lifescale
license = GNU GPLv3
classifiers =
    License :: OSI Approved :: GNU General Public License (GPL)
    Programming Language :: Python :: 3

[options]
python_requires = >=3.6, <4
packages = find:
zip_safe = True
include_package_data = True
install_requires =
    numpy
    pandas
    openpyxl

[options.entry_points]
console_scripts =
    ls2csv = lifescale.command_line.command_line:ls2csv