Calculating descriptives on matching groups

G

glpsx

Guest
I'm working with a pandas DataFrame having the following structure:

import pandas as pd
import numpy as np

df = pd.DataFrame({'brand' : ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],
'value' : [111, 111, 222, 222, 333, 333, 444, 444, 444, 555, 555, 555, 666, 777, 888]})

print(df)

brand value
0 A 111
1 B 111
2 A 222
3 B 222
4 A 333
5 B 333
6 A 444
7 B 444
8 C 444
9 A 555
10 B 555
11 C 555
12 A 666
13 B 777
14 C 888


I would like to calculate somes descriptives on the above table:

First, count the number of groups having different brands that match to the same value. Here's a helping visual:

enter image description here

As you can see, there are 5 groups where brands inside each group are different that match to the same value. So basically, I want to write a query that would output the number of those groups (5 in this case).


Second, I would like to have a more detailed view of each group. Specifically, calculate the:

  • Number of groups with a single records : 3 (the last 3 records)
  • Number of groups with 2 records : 3 (groups red, green and blue)
  • Number of groups with 3 records : 2 (groups purple and black)

I tried to come up with a solution on my with without any success using groupby statements. Any help would be appreciated.

Continue reading...
 
Top