[Solved] Python – How to unpack a list of lists of tuples in a dataframe

I have a dataframe in which one column contains a list of lists of tuples. I want to unpack each of these into N rows of the dataframe, where N is the number of inner lists in that row's list. I have tried solutions from related questions, but I could not adapt them to my problem.

   import pandas as pd
   import numpy as np

 index       element              Lanes   Category
   0     [[(A, A), (B, B)],         M      1
         [(B, B), (C, C)]]

   1     [[(A, A), (D, D)],         B      2
         [(D, D), (L, L)],
         [(L, L), (O, O)]]

Given this input dataframe, how do I convert it to long format, resulting in:

   index       element           Lanes   Category
   0      (A, A), (B, B)          M       1
   1      (B, B), (C, C)          M       1

   2      (A, A), (D, D)          B       2
   3      (D, D), (L, L)          B       2
   4      (L, L), (O, O)          B       2   
Enquirer: Dan


Solution #1:

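Here’s one way, adapting @WenYoBen’s answer: flatten the element column by one level and repeat the other columns once per inner list so that they stay aligned.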

# number of inner lists in each row
lens = df.element.str.len()
# flatten one level; repeat the other columns to match
pd.DataFrame({'element': sum(df.element.tolist(), []),
              'Category': df.Category.repeat(lens).values,
              'Lanes': df.Lanes.repeat(lens).values})

        element         Category Lanes
0  [(A, A), (B, B)]         1     M
1  [(B, B), (C, C)]         1     M
2  [(A, A), (D, D)]         2     B
3  [(D, D), (L, L)]         2     B
4  [(L, L), (O, O)]         2     B
Respondent: yatu
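
The snippet above assumes `df` already exists. As a minimal self-contained sketch of the same repeat-and-flatten idea, the example below builds sample data shaped like the question's (the string tuples and the name `long_df` are illustrative assumptions) and swaps `sum(..., [])` for `itertools.chain.from_iterable`, which flattens one level without repeatedly concatenating lists.

```python
from itertools import chain

import pandas as pd

# Sample data mirroring the question; string tuples stand in for (A, A) etc.
df = pd.DataFrame({
    "element": [
        [[("A", "A"), ("B", "B")], [("B", "B"), ("C", "C")]],
        [[("A", "A"), ("D", "D")], [("D", "D"), ("L", "L")], [("L", "L"), ("O", "O")]],
    ],
    "Lanes": ["M", "B"],
    "Category": [1, 2],
})

# Number of inner lists in each row (2 and 3 for the sample data).
lens = df["element"].str.len()

long_df = pd.DataFrame({
    # Flatten one level: each inner list of tuples becomes its own row.
    "element": list(chain.from_iterable(df["element"])),
    # Repeat the scalar columns once per inner list so they stay aligned.
    "Lanes": df["Lanes"].repeat(lens).to_numpy(),
    "Category": df["Category"].repeat(lens).to_numpy(),
})

print(long_df)
```

For two short rows the difference is negligible; the `chain` variant is mainly worth considering when the column holds many lists.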

Solution #2:

Here is an alternative way:

import pandas as pd
import numpy as np

d = {'element' : pd.Series([[[('A', 'A'), ('B', 'B')],[('B', 'B'), ('C', 'C')]],[[('A', 'A'), ('D', 'D')],[('D', 'D'), ('L', 'L')],[('L', 'L'), ('O', 'O')]]]),
      'Lanes' : pd.Series(['M','B']),
      'Category' : pd.Series([1,2])}

# creates Dataframe.
df = pd.DataFrame(d)

# print the data.
print(df)

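# Spread each inner list into its own column, then melt back to long format.
# The chain is wrapped in parentheses so it can span several lines.
# Note: rows come out grouped by position within the list, not by original row.
df1 = (df.element.apply(pd.Series)
       .merge(df, right_index=True, left_index=True)
       .drop(["element"], axis=1)
       .melt(id_vars=['Lanes', 'Category'], value_name="element")
       .drop("variable", axis=1)
       .dropna()
       .reset_index(drop=True))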

print(df1)
Respondent: Anupam Bera
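
Neither answer above uses it, but on pandas 0.25 or later `DataFrame.explode` might be the shortest route to the same long format. The sketch below is an alternative under that version assumption, not part of the collected answers; the sample data and the name `exploded` are illustrative.

```python
import pandas as pd

# Sample data mirroring the question; string tuples stand in for (A, A) etc.
df = pd.DataFrame({
    "element": [
        [[("A", "A"), ("B", "B")], [("B", "B"), ("C", "C")]],
        [[("A", "A"), ("D", "D")], [("D", "D"), ("L", "L")], [("L", "L"), ("O", "O")]],
    ],
    "Lanes": ["M", "B"],
    "Category": [1, 2],
})

# explode() unpacks only the outer list: each inner list of tuples gets its
# own row, and the values in the other columns are copied alongside it.
# The original index labels are repeated, so reset them to 0..N-1.
exploded = df.explode("element").reset_index(drop=True)

print(exploded)
```

Because `explode` unpacks a single level, the inner lists of tuples stay intact, and the rows keep the order of the original dataframe.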

The answers/resolutions are collected from Stack Overflow and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
