Create CSV Reports from GIT Repositories containing your commits

Some months ago, I needed to go through several GIT Repositories and collect the work I did on each day. The plan was to gather all the data and store it in separate CSV files.
Since I wasn’t able to find a ready-made script for this task, I guess it is a good candidate for a quick blog post :-).

The first part is a file folders.txt with a list of all GIT Repositories that we want to analyse. (All folders are Subfolders of a root Directory /Users/user/GIT. This root folder can be changed later on.)

cat folders.txt

tools
utils
customer1/project2
customer2/project1
customer2/project2
customer3/project1
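
If you do not want to write folders.txt by hand, it could also be generated. The following is only a sketch: the function name list_repos is made up for this example, and it assumes that every repository has a regular .git directory (worktrees or submodules, where .git is a file, are skipped).

```shell
# Hypothetical helper: print all repositories below a root folder,
# relative to that root, one per line (the format used in folders.txt).
list_repos() {
    local root=${1%/}   # strip a trailing slash, if any
    local git_dir repo
    # -prune stops find from descending into the .git folders themselves
    find "$root" -type d -name .git -prune | while IFS= read -r git_dir; do
        repo=${git_dir%/.git}        # drop the trailing /.git
        echo "${repo#"$root"/}"      # drop the leading root folder
    done | sort
}
```

It could then be used like this: list_repos "$HOME/GIT" > folders.txt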

The script does several things:

  1. It goes through every repository and collects Project,Date,Commit,Name,Email,Comment of each commit.
  2. It filters the commit messages to deal with characters that might break the CSV later on.
  3. The last step is to split the complete log file into the different months.
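
The filtering in step 2 can be sketched in isolation. This is a minimal example, assuming that every field was wrapped in __ markers by git log beforehand; the function name to_csv is made up for this example.

```shell
# Hypothetical helper: turn marker-wrapped fields into quoted CSV fields.
# Reads lines like __project__,__date__,...,__message__ on stdin.
to_csv() {
    # 1st expression: double every quote inside the fields (CSV escaping)
    # 2nd expression: replace the __ markers with the surrounding quotes
    sed -E -e 's/"/""/g' -e 's/__+/"/g'
}
```

One limitation of the marker approach: a commit message that contains a double underscore itself (for example __init__) is turned into quotes as well.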

At the moment, the script only runs for one specific year, but that can be changed by adding another loop to run it for several years.
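
Such a loop over several years might look like the following sketch. The function split_by_month and its parameters are assumptions for this example; it expects the combined log as a CSV file with the date in the second column and leaves out the filtering by author.

```shell
HEADER=Project,Date,Commit,Name,Email,Comment

# Hypothetical helper: split_by_month <all.csv> <output folder> <year>...
split_by_month() {
    local all_csv=$1 out_dir=$2
    shift 2
    local year month file
    for year in "$@"; do
        mkdir -p "$out_dir/$year"
        for month in $(seq -f "%02g" 1 12); do
            file="$out_dir/$year/$year-$month.csv"
            echo "$HEADER" > "$file"
            # The date is the second column, so it follows the first ,"
            grep ",\"$year-$month-" "$all_csv" >> "$file" || true
        done
    done
}
```

For the years 2021 and 2022, the call would be: split_by_month /tmp/csv/all.csv csv 2021 2022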

The Source of the Script is:

#!/bin/bash

#YEAR=$(date +"%Y")
HEADER=Project,Date,Commit,Name,Email,Comment
YEAR=2022
ROOT=$(pwd)
GIT_ROOT=$HOME/GIT
PROJECTS=$(cat folders.txt)
TMP_DIR=/tmp/csv
CREATOR="Philipp Haussleiter"

# Create the folders first, then start with an empty temp folder
mkdir -p "csv/$YEAR" "$TMP_DIR"
rm -Rf "$TMP_DIR"/*
: > "$TMP_DIR/all.csv"

for PROJECT in ${PROJECTS}; do
    echo "creating log of ${PROJECT}"
    DIR=${GIT_ROOT}/${PROJECT}
    BASENAME=$(basename "$DIR")
    cd "${DIR}" || continue
    # Wrap every field in __ markers, so that quotes inside the fields can be escaped first
    git log --pretty=format:"__${BASENAME}__,__%cs__,__%h__,__%an__,__%ae__,__%s__" > "$TMP_DIR/${BASENAME}.a.log"
    # Double every quote (CSV escaping) ...
    sed -E 's/"/""/g' "$TMP_DIR/${BASENAME}.a.log" > "$TMP_DIR/${BASENAME}.b.log"
    # ... then turn the __ markers into the surrounding quotes
    sed -E 's/__+/"/g' "$TMP_DIR/${BASENAME}.b.log" > "$TMP_DIR/${BASENAME}.log"
    # With format:, git log does not end its output with a newline
    echo "" >> "$TMP_DIR/${BASENAME}.log"
    cat "$TMP_DIR/${BASENAME}.log" >> "$TMP_DIR/all.csv"
    rm "$TMP_DIR/${BASENAME}".a.* "$TMP_DIR/${BASENAME}".b.*
    cd "${ROOT}"
done

for MONTH in $(seq -f "%02g" 1 12); do
    FILE=csv/$YEAR/${YEAR}-${MONTH}.csv
    FILTER=${YEAR}-${MONTH}
    echo "$HEADER" > "$FILE"
    # Match the opening quote of the date field as well, not just any 2022-01 in the line
    grep "$CREATOR" "$TMP_DIR/all.csv" | grep "\"$FILTER-" >> "$FILE"
    echo "$FILE"
done

# All commits of the creator in one file (written once, not once per month)
echo "$HEADER" > csv/all.csv
grep "$CREATOR" "$TMP_DIR/all.csv" >> csv/all.csv

After running the script for the years 2021 and 2022, you get a folder structure like this:

csv
├── 2021
│   ├── 2021-01.csv
│   ├── 2021-02.csv
│   ├── 2021-03.csv
│   ├── 2021-04.csv
│   ├── 2021-05.csv
│   ├── 2021-06.csv
│   ├── 2021-07.csv
│   ├── 2021-08.csv
│   ├── 2021-09.csv
│   ├── 2021-10.csv
│   ├── 2021-11.csv
│   └── 2021-12.csv
└── 2022
    ├── 2022-01.csv
    ├── 2022-02.csv
    ├── 2022-03.csv
    ├── 2022-04.csv
    ├── 2022-05.csv
    ├── 2022-06.csv
    ├── 2022-07.csv
    ├── 2022-08.csv
    ├── 2022-09.csv
    ├── 2022-10.csv
    ├── 2022-11.csv
    └── 2022-12.csv

Teaching Mailcow how to deal with Ham/Spam

The good must be put in the dish, the bad you may eat if you wish.

Cinderella

Mailcow is a groupware solution that is mainly used for email messaging. With Mailcow, you can set up your own Docker-based Mail-Server + Addons.

Mailcow uses rspamd to filter out Spam Messages.
However, after some time, there is a need to fine-tune the filtering of Spam (unwanted Messages) and Ham (“good” Messages).

There is a documented method to learn Spam from existing emails within a directory, but that might be hard to understand, especially for non-technical users.

So I updated this method a little bit:

  • Every user has two folders rspamd/spam and rspamd/ham in their home directory.
  • Every user can now just drop new spam messages into the spam folder and messages that were wrongly classified as spam into the ham folder.
  • A cron job runs every hour, parses the user directories for new files and updates the rspamd behaviour.
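
The two folders per user could be created with a small helper like the one below. This is only a sketch: the function name create_learning_folders is made up, and it assumes that all user home directories live directly below /home.

```shell
# Hypothetical helper: create rspamd/spam and rspamd/ham in every home directory.
create_learning_folders() {
    local home_root=${1:-/home}
    local home
    for home in "$home_root"/*/; do
        [ -d "$home" ] || continue   # skip the unexpanded pattern if there are no homes
        mkdir -p "${home}rspamd/spam" "${home}rspamd/ham"
    done
}
```

When this runs as root, the new folders belong to root, so the ownership has to be adjusted afterwards (for example with chown) before the users can drop files into them.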

The script for SPAM learning looks like this (assuming that mailcow is installed in /opt/mailcow-dockerized):

#!/bin/bash

cd /opt/mailcow-dockerized || exit 1
# Collect the new spam mails of all users
for u in $(ls /home); do
    mv /home/$u/rspamd/spam/* ./data/rspamd/spam/
done
# Feed every collected mail to rspamd
for file in ./data/rspamd/spam/*; do
    [ -e "$file" ] || continue   # nothing was collected
    docker exec -i $(docker-compose ps -q rspamd-mailcow) rspamc learn_spam < "$file"
done

rm -Rf ./data/rspamd/spam/*

There is also a similar script for HAM learning:

#!/bin/bash

cd /opt/mailcow-dockerized || exit 1
# Collect the new ham mails of all users
for u in $(ls /home); do
    mv /home/$u/rspamd/ham/* ./data/rspamd/ham/
done
# Feed every collected mail to rspamd
for file in ./data/rspamd/ham/*; do
    [ -e "$file" ] || continue   # nothing was collected
    docker exec -i $(docker-compose ps -q rspamd-mailcow) rspamc learn_ham < "$file"
done
rm -Rf ./data/rspamd/ham/*

Both scripts produce some output, so a good way of running them via cron is to redirect the output into a log file.
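
One possible way to do the logging is a small wrapper that also adds a timestamp to every run. This is a sketch, not part of my setup: the function name run_logged and all paths in the comments are hypothetical.

```shell
# Hypothetical wrapper: run_logged <log file> <command> [arguments...]
# Appends a timestamp line plus stdout and stderr of the command to the log file.
run_logged() {
    local log_file=$1
    shift
    {
        echo "=== $(date '+%Y-%m-%d %H:%M:%S') $* ==="
        "$@"
    } >> "$log_file" 2>&1
}

# Without the wrapper, a plain redirect in the crontab does the job as well
# (hourly run, hypothetical paths):
# 0 * * * * /opt/scripts/learn-spam.sh >> /var/log/learn-spam.log 2>&1
```

The 2>&1 matters here, because otherwise the error messages would not end up in the log file.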

Using the MacOS airport utility

Sometimes you need to gather information about the current WiFi Connection of your Mac via CLI only (maybe you just have a remote SSH Connection).

The airport tool is a handy utility that performs most of the tasks that you would normally do via the UI.

You can find that tool in /System/Library/PrivateFrameworks/Apple80211.framework/Versions/Current/Resources/airport. To run it, you need to have elevated access rights (run it with sudo).

It is best to create an alias before using the tool:

alias airport='sudo /System/Library/PrivateFrameworks/Apple80211.framework/Versions/A/Resources/airport'

Display the WiFi Preferences

philipp@Imotep ~ % airport prefs
AirPort preferences for en0:

DisconnectOnLogout=NO
JoinMode=Strongest
JoinModeFallback=DoNothing
RememberRecentNetworks=YES
RequireAdminIBSS=NO
RequireAdminNetworkChange=NO
RequireAdminPowerToggle=NO
AllowLegacyNetworks=NO
WoWEnabled=NO

Listing all available WiFi Networks

philipp@Imotep ~ % airport  -s
Password:
                      SSID BSSID             RSSI CHANNEL HT CC SECURITY (auth/unicast/group)
                  Network1 24:65:11:d3:bd:85 -88  1,+1    Y  -- RSN(PSK/AES/AES)
                  Network2 ac:22:05:1c:12:4d -83  40      Y  -- RSN(PSK/AES/AES)
                      Home 3c:a6:2f:78:22:cc -80  11      Y  DE RSN(PSK/AES/AES)
                      Home 3c:a6:2f:78:22:cb -78  40      Y  DE RSN(PSK/AES/AES)
          Vodafone Hotspot ae:22:15:1c:12:6f -77  1       Y  EU NONE
                  Network2 ac:22:05:1c:12:6f -77  1       Y  EU RSN(PSK/AES/AES)
                      Home b8:be:f4:87:2e:b0 -74  6,+1    Y  DE RSN(PSK,FT-PSK/AES/AES)
                      Home b8:be:f4:87:2e:b1 -73  48      Y  DE RSN(PSK,FT-PSK/AES/AES)
muenchen.freifunk.net/welt 66:b6:fc:72:c2:28 -51  6       Y  DE NONE
muenchen.freifunk.net/welt 9c:c9:eb:4f:a7:91 -59  44      Y  DE NONE
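
The scan output can also be processed further, for example to sort the networks by signal strength. The following is a sketch under the assumption that the output looks like the listing above; the function name networks_by_signal is made up. Since an SSID may contain spaces, the BSSID column is used as the anchor.

```shell
# Hypothetical helper: read `airport -s` output on stdin and print
# "RSSI<TAB>SSID" for every network, strongest signal first.
networks_by_signal() {
    awk '
        BEGIN {
            h = "[0-9a-f][0-9a-f]"
            mac = h ":" h ":" h ":" h ":" h ":" h   # pattern for the BSSID column
        }
        match($0, mac) {
            ssid = substr($0, 1, RSTART - 1)          # everything before the BSSID
            gsub(/^[ \t]+|[ \t]+$/, "", ssid)         # trim the alignment padding
            split(substr($0, RSTART + RLENGTH), rest, " ")
            print rest[1] "\t" ssid                   # rest[1] is the RSSI column
        }' | sort -nr
}
```

The header line and the password prompt contain no BSSID, so they are skipped. An SSID that looks like a MAC address itself would confuse this sketch.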

Listing a specific WiFi Network

Use airport -s <SSID>

philipp@Imotep ~ % airport -s Home
                          SSID BSSID             RSSI CHANNEL HT CC SECURITY (auth/unicast/group)
                          Home 3c:a6:2f:78:22:cc -81  11      Y  DE RSN(PSK/AES/AES)
                          Home 3c:a6:2f:78:22:cb -78  100     Y  DE RSN(PSK/AES/AES)
                          Home b8:be:f4:87:2e:b0 -74  6,+1    Y  DE RSN(PSK,FT-PSK/AES/AES)
                          Home b8:be:f4:87:2e:b1 -69  48      Y  DE RSN(PSK,FT-PSK/AES/AES)

Display the Metrics of your current connection

philipp@Imotep ~ % airport  -I
     agrCtlRSSI: -44
     agrExtRSSI: 0
    agrCtlNoise: -95
    agrExtNoise: 0
          state: running
        op mode: station
     lastTxRate: 144
        maxRate: 144
lastAssocStatus: 0
    802.11 auth: open
      link auth: wpa2-psk
          BSSID: e4:c3:2a:dd:36:f8
           SSID: Home
            MCS: 15
  guardInterval: 800
            NSS: 2
        channel: 9
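
If you only need a single value from this output in a script, it can be picked out with awk. This is a sketch; the function name airport_field is made up, and it assumes the "key: value" layout shown above.

```shell
# Hypothetical helper: read `airport -I` output on stdin and print the value of one field.
airport_field() {
    local field=$1
    awk -v key="$field" '
        !found {
            line = $0
            sub(/^[ \t]+/, "", line)            # drop the right-alignment padding
            prefix = key ": "
            if (index(line, prefix) == 1) {     # exact key match, so SSID does not match BSSID
                print substr(line, length(prefix) + 1)
                found = 1
            }
        }'
}
```

Aliases are not expanded in non-interactive bash scripts by default, so inside a script the tool has to be called with its full path and piped into airport_field SSID. With the sample output above, agrCtlRSSI (-44) minus agrCtlNoise (-95) gives a signal-to-noise ratio of 51 dB.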

Get the SSID of the currently connected WiFi

philipp@Imotep ~ %  ioreg -l -n AirPortDriver | grep IO80211SSID | sed 's/^.*= "\(.*\)".*$/\1/; s/ /_/g'

Home

Project 364

I wish you all a great new year 2023!

New year, new plans.

Over the last months, I collected a lot of content pieces with the plan to publish them one day. So for this year, I decided to force myself into publishing one content piece each day. That means a Blog Post, a tip or a tutorial.

That means 364 things to publish. I plan to publish them on at least two blogs:

Maybe I will add one or two more blogs later on.

I did create a listing with the numbering, the title and links to all posts:

However, there will be some “cheats” :-).

  1. I will try to publish some posts in English, as well as in German. In that case, the German and the English version will each count as one separate blog post.
  2. I might not be able to publish a post on time (e.g. before or on the exact date), but I will take care that there will be a post for that date eventually.

    The Goal is to have that number 364 at the end of 2023.

The topics of these blog posts will be mainly technical, mostly in the area of Software development. Maybe some organisational topics as well.
I will also post some tutorials about interesting projects or SaaS.

As always, feel free to comment and ask questions, as some of you did in the past.

Again, all the best to you and your family in 2023!

Best Regards,
Philipp