Re: Simple analytics for docs.kernel.org and patchwork, please?

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Em Fri, 23 Feb 2024 08:31:54 -0800
Jakub Kicinski <kuba@xxxxxxxxxx> escreveu:

> Hi!
> 
> We have a few netdev-related bots with various simple status pages.
> I hooked them up to analytics recently, here's the dash:
> https://plausible.io/netdev.bots.linux.dev
> 
> Plausible was described here: https://lwn.net/Articles/822568/
> it's supposedly-open, and privacy-focused, no cookies etc.
> 
> It's useful for me when deciding where to invest my time,
> and to back up the efforts to my employer with some data.
> 
> Now, most of us agree that kernel docs leave something to be desired.
> At the same time maintainers are repeatedly faced with people who post
> code without reading the docs, which puts the time invested in writing
> them into question. I can't help but think that providing some
> analytics for docs.kernel.org traffic would be beneficial. 
> I would use it.
> 
> Thoughts?
> 
> Does anyone think that even non-intrusive analytics are a no go?
> 
> Does anyone know better alternatives than Plausible?

I have a small hand-made script using Pandas/Seaborn to produce
some patchwork statistics as can be seen at:

	https://linuxtv.org/patchwork_stats.php

Feel free to use it as a basis to get some stats.

You may need to modify it to cover stats per project (on Linux media, there's
just two projects, and the second one with handful number of patches per year).
So, we didn't need to filter it per project.

Also, please notice that the second query doesn't use any index on Patwork 3.1. 

I ended manually creating an index to speed it up on mariadb with:

	CREATE INDEX idx_patchwork_patch_stateid_date on patchwork_patch(state_id, date);

I hope that helps.

Regards,
Mauro

---

#!/usr/bin/env python3
# SPDX-License-Identifier: GPL-2.0
# Copyright(c) Mauro Carvalho Chehab <mchehab@xxxxxxxxxx>

from datetime import datetime, date, timedelta
from matplotlib.dates import DateFormatter
from matplotlib.pyplot import xlim
from pandas import read_sql
from seaborn import relplot, set_style,axes_style
from sqlalchemy import create_engine, text

DIR = './'

def log(msg):
    now = datetime.now().strftime("%d/%m/%Y %H:%M:%S")
    print(f'{now}: {msg}')

today = date.today()

# Consider yesterday as the final date
end_date = today - timedelta(days=1)

# Two complete years + this month
start_date = end_date.replace(year=today.year - 2).replace(month=today.month - 1)
start_date = start_date.replace(day=1)

interval = f'date >= "{start_date}" and date <= "{end_date}"'

log("Connecting to database")

engine = create_engine("mysql://patchwork:yaicCoqui@localhost/patchwork?charset=utf8mb4")

palette = "bright"
background = "#555555"

style = {
    'axes.facecolor':background,
    'grid.color':'white',
    'axes.edgecolor': 'orange',
    'axes.labelcolor': 'orange',
    'text.color': '#ffcc00',
    'xtick.color': 'white',
    'ytick.color': 'white',
    'patch.edgecolor': 'orange',
    'figure.facecolor':'black'
}


xformatter = DateFormatter("%Y-%m")

with engine.connect() as conn:
     # Total patches
    query = text(f'select DATE_FORMAT(date, "%Y-%m") AS date, count(*) AS patches from patchwork_patch WHERE {interval} group by DATE_FORMAT(date, "%Y-%m") ORDER BY YEAR(date), MONTH(DATE)')
    log(query)
    total = read_sql(query, con=conn, parse_dates=['date'])

    log("Creating total patches graph")

    set_style(style="darkgrid",rc=style)

    print({k: v for k, v in axes_style().items() if "color" in k})

    g = relplot(kind="line", marker='x', markers=True, data=total, x="date", y="patches")
    g.set_axis_labels("Date", "Number of patches", labelpad=10)
    g.set(title=f'Number of patches received per month between {start_date} and {end_date}')
    g.figure.set_size_inches(14, 6)
    print(g.ax)
    g.ax.margins(.05)
    g.ax.autoscale_view()
    g.ax.edgecolor="black"
    g.despine(trim=True, left=True, bottom=True)
    xlim(start_date, end_date)
    g.axes[0,0].xaxis.set_major_formatter(xformatter)

    g.savefig(DIR + 'patches_per_date.svg')

    # Patches per state
    query = text(f'select DATE_FORMAT(date, "%Y-%m") as date, st.name as State, count(*) as patches from patchwork.patchwork_patch AS p, patchwork_state as st where state_id = st.id and {interval} group by DATE_FORMAT(date, "%Y-%m"), st.id')
    log(query)
    per_state = read_sql(query, con=conn, parse_dates=['date'])

    log("Creating patches per state")

    per_state.set_index('date', inplace=True)

    g = relplot(kind="line", data=per_state, x="date", y="patches", hue="State", markers="State", marker="X", palette=palette)

    g.set_axis_labels("Date", "Number of patches", labelpad=10)
    g.set(title=f'Number of patches per state received per month between {start_date} and {end_date}')
    g.figure.set_size_inches(13.5, 6)
    g.ax.margins(.05)
    g.ax.autoscale_view()
    g.ax.edgecolor=False
    g.despine(trim=True, left=True, bottom=True)
    xlim(start_date, end_date)
    g.axes[0,0].xaxis.set_major_formatter(xformatter)

    g.add_legend(loc='upper left', bbox_to_anchor=(1.12, 0.5))
    g.savefig(DIR + 'patches_per_state.svg')

    log("Done.")




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux