I have a data frame that stores store name and daily sales count. I am trying to insert this to Salesforce using the Python script below.

However, I get the following error:

TypeError: Object of type 'int64' is not JSON serializable

Below, there is the view of the data frame.

Storename,Count
Store A,10
Store B,12
Store C,5

I use the following code to insert it to Salesforce.

update_list = []
for i in range(len(store)):
    update_data = {
        'name': store['entity_name'].iloc[i],
        'count__c': store['count'].iloc[i] 
    }
    update_list.append(update_data)

sf_data_cursor = sf_datapull.salesforce_login()
sf_data_cursor.bulk.Account.update(update_list)

I get the error when the last line above gets executed.

How do I fix this?

json does not recognize NumPy data types. Convert the number to a Python int before serializing the object:

'count__c': int(store['count'].iloc[i])

You can define your own encoder to solve this problem.

import json
import numpy as np

class NpEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, np.integer):
            return int(obj)
        if isinstance(obj, np.floating):
            return float(obj)
        if isinstance(obj, np.ndarray):
            return obj.tolist()
        return super(NpEncoder, self).default(obj)

# Your codes .... 
json.dumps(data, cls=NpEncoder)

I’ll throw in my answer to the ring as a bit more stable version of @Jie Yang’s excellent solution.

My solution

numpyencoder and its repository.

from numpyencoder import NumpyEncoder

numpy_data = np.array([0, 1, 2, 3])

with open(json_file, 'w') as file:
    json.dump(numpy_data, file, indent=4, sort_keys=True,
              separators=(', ', ': '), ensure_ascii=False,
              cls=NumpyEncoder)

The breakdown

If you dig into hmallen’s code in the numpyencoder/numpyencoder.py file you’ll see that it’s very similar to @Jie Yang’s answer:


class NumpyEncoder(json.JSONEncoder):
    """ Custom encoder for numpy data types """
    def default(self, obj):
        if isinstance(obj, (np.int_, np.intc, np.intp, np.int8,
                            np.int16, np.int32, np.int64, np.uint8,
                            np.uint16, np.uint32, np.uint64)):

            return int(obj)

        elif isinstance(obj, (np.float_, np.float16, np.float32, np.float64)):
            return float(obj)

        elif isinstance(obj, (np.complex_, np.complex64, np.complex128)):
            return {'real': obj.real, 'imag': obj.imag}

        elif isinstance(obj, (np.ndarray,)):
            return obj.tolist()

        elif isinstance(obj, (np.bool_)):
            return bool(obj)

        elif isinstance(obj, (np.void)): 
            return None

        return json.JSONEncoder.default(self, obj)

A very simple numpy encoder can achieve similar results more generically.

Note this uses the np.generic class (which most np classes inherit from) and uses the a.item() method.

If the object to encode is not a numpy instance, then the json serializer will continue as normal. This is ideal for dictionaries with some numpy objects and some other class objects.

import json
import numpy as np

def np_encoder(object):
    if isinstance(object, np.generic):
        return object.item()

json.dumps(obj, default=np_encoder)

If you are going to serialize a numpy array, you can simply use ndarray.tolist() method.

From numpy docs,

a.tolist() is almost the same as list(a), except that tolist changes numpy scalars to Python scalars

In [1]: a = np.uint32([1, 2])

In [2]: type(list(a)[0])
Out[2]: numpy.uint32

In [3]: type(a.tolist()[0])
Out[3]: int

This might be the late response, but recently i got the same error. After lot of surfing this solution helped me.

def myconverter(obj):
        if isinstance(obj, np.integer):
            return int(obj)
        elif isinstance(obj, np.floating):
            return float(obj)
        elif isinstance(obj, np.ndarray):
            return obj.tolist()
        elif isinstance(obj, datetime.datetime):
            return obj.__str__()

Call myconverter in json.dumps() like below.
json.dumps('message', default=myconverter)

If you have this error

TypeError: Object of type ‘int64’ is not JSON serializable

You can change that specific columns with int dtype to float64, as example:

df = df.astype({'col1_int':'float64', 'col2_int':'float64', etc..})

Float64 is written fine in Google Spreadsheets

If you have control over the creation of DataFrame, you can force it to use standard Python types for values (e.g. int instead of numpy.int64) by setting dtype to object:

df = pd.DataFrame(data=some_your_data, dtype=object)

The obvious downside is that you get less performance than with primitive datatypes. But I like this solution tbh, it’s really simple and eliminates all possible type problems. No need to give any hints to the ORM or json.

I was able to make it work with loading the dump.

Code:

import json

json.loads(json.dumps(your_df.to_dict()))

update_data = {
    'name': str(store['entity_name'].iloc[i]),
    'count__c': str(store['count'].iloc[i]) 
}