G

#### glpsx

##### Guest

import pandas as pd

import numpy as np

df = pd.DataFrame({'brand' : ['A', 'B', 'A', 'B', 'A', 'B', 'A', 'B', 'C', 'A', 'B', 'C', 'A', 'B', 'C'],

'value' : [111, 111, 222, 222, 333, 333, 444, 444, 444, 555, 555, 555, 666, 777, 888]})

print(df)

brand value

0 A 111

1 B 111

2 A 222

3 B 222

4 A 333

5 B 333

6 A 444

7 B 444

8 C 444

9 A 555

10 B 555

11 C 555

12 A 666

13 B 777

14 C 888

I would like to calculate somes descriptives on the above table:

First, count the number of groups having different brands that match to the same value. Here's a helping visual:

As you can see, there are 5 groups where brands inside each group are different that match to the same value. So basically, I want to write a query that would output the number of those groups (5 in this case).

Second, I would like to have a more detailed view of each group. Specifically, calculate the:

- Number of groups with a single records : 3 (the last 3 records)
- Number of groups with 2 records : 3 (groups red, green and blue)
- Number of groups with 3 records : 2 (groups purple and black)

I tried to come up with a solution on my with without any success using groupby statements. Any help would be appreciated.

Continue reading...