Running Coffea with the Dask/Futures executor throws an error: cannot pickle 'property' object #302

Closed
oshadura opened this issue May 12, 2020 · 2 comments · Fixed by #337

oshadura commented May 12, 2020

Describe the bug

Running one of the ADL examples from https://github.com/mat-adamec/coffea-benchmarks, a METProcessor(processor.ProcessorABC), with the Dask or Futures executor fails with: cannot pickle 'property' object

I am almost sure this is a problem with the Python version.

To Reproduce

import os

from coffea import hist
from coffea.analysis_objects import JaggedCandidateArray
import coffea.processor as processor

from dask.distributed import Client, LocalCluster
from dask_jobqueue import HTCondorCluster

fileset = {
    'Jets': { 'files': ['root://eospublic.cern.ch//eos/root-eos/benchmark/Run2012B_SingleMu.root'],
             'treename': 'Events'
            }
}

class METProcessor(processor.ProcessorABC):
    def __init__(self):
        self._columns = ['MET_pt']
        dataset_axis = hist.Cat("dataset", "")
        MET_axis = hist.Bin("MET", "MET [GeV]", 50, 0, 100)
        self._accumulator = processor.dict_accumulator({
            'MET': hist.Hist("Counts", dataset_axis, MET_axis),
            'cutflow': processor.defaultdict_accumulator(int)
        })

    @property
    def accumulator(self):
        return self._accumulator

    @property
    def columns(self):
        return self._columns

    def process(self, df):
        output = self.accumulator.identity()
        # Dataset name used to fill the categorical "dataset" axis
        dataset = df['dataset']
        MET = df['MET_pt']
        output['cutflow']['all events'] += MET.size
        output['cutflow']['number of chunks'] += 1
        output['MET'].fill(dataset=dataset, MET=MET.flatten())
        return output

    def postprocess(self, accumulator):
        return accumulator

client = Client(processes=False, dashboard_address=None)

exe_args = {
        'client': client,
    }
output = processor.run_uproot_job(fileset,
                                treename = 'Events',
                                processor_instance = METProcessor(),
                                executor = processor.dask_executor,
                                executor_args = exe_args
                                )

hist.plot1d(output['MET'], overlay='dataset', fill_opts={'edgecolor': (0,0,0,0.3), 'alpha': 0.8})

for key, value in output['cutflow'].items():
    print(key, value)

Output

Traceback (most recent call last):
  File "adl1.py", line 65, in <module>
    output = processor.run_uproot_job(fileset,
  File "/usr/lib/python3.8/site-packages/coffea/processor/executor.py", line 774, in run_uproot_job
    pi_to_send = lz4f.compress(cloudpickle.dumps(processor_instance), compression_level=pi_compression)
  File "/usr/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py", line 62, in dumps
    cp.dump(obj)
  File "/usr/lib/python3.8/site-packages/cloudpickle/cloudpickle_fast.py", line 538, in dump
    return Pickler.dump(self, obj)
TypeError: cannot pickle 'property' object
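
For context, the message at the bottom of the traceback can be reproduced with the standard library alone, without Coffea, Dask, or cloudpickle. The sketch below only shows that a bare `property` object (what `@property` creates for `accumulator` and `columns`) is not picklable by the stdlib `pickle` module. It does not reproduce the cloudpickle code path itself; the helper name `pickle_error` is made up for this illustration, and the exact wording of the message may differ between Python versions.

```python
import pickle


def pickle_error(obj):
    """Return the TypeError message raised when pickling obj, or None if it pickles fine."""
    try:
        pickle.dumps(obj)
    except TypeError as exc:
        return str(exc)
    return None


def get_columns(self):
    # Stand-in getter, similar in spirit to METProcessor.columns
    return ['MET_pt']


# A bare property object, like the ones @property attaches to the class
prop = property(get_columns)

message = pickle_error(prop)
print(message)
```

A serializer that ships a class by value has to handle such attributes itself, so an error like this during `cloudpickle.dumps(processor_instance)` points at the serializer rather than at the processor definition.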

Desktop (please complete the following information):

  • OS: Manjaro
  • Python 3.8.2
  • Coffea: 0.6.39

CC: @mat-adamec

oshadura added the bug label May 12, 2020

nsmith- commented Aug 25, 2020

I think this is probably due to cloudpipe/cloudpickle#329.
Pinning cloudpickle >= 1.2.3 should avoid it.
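
A quick way to see whether an environment already satisfies that suggested bound is to compare the installed cloudpickle version against 1.2.3. This is a minimal sketch using only `importlib.metadata` (Python 3.8+). The helpers `parse_version` and `meets_minimum` are hypothetical names for this example, the parsing is deliberately simplistic (it ignores pre-release and dev suffixes), and the 1.2.3 threshold is taken from the comment above rather than independently verified.

```python
from importlib import metadata  # available in the standard library from Python 3.8

# Lower bound suggested in this thread (an assumption, not a verified fix version)
MINIMUM = (1, 2, 3)


def parse_version(text):
    """Turn '1.5.0' into (1, 5, 0), stopping at the first non-numeric component."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(text, minimum=MINIMUM):
    """Return True if the version string is at least the given minimum."""
    return parse_version(text) >= minimum


try:
    installed = metadata.version("cloudpickle")
except metadata.PackageNotFoundError:
    installed = None

if installed is None:
    print("cloudpickle is not installed in this environment")
else:
    print("cloudpickle", installed, "meets minimum:", meets_minimum(installed))
```

In a real project the bound would more naturally live in the package's install requirements; the check above is only meant for diagnosing an existing environment.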

nsmith- added a commit that referenced this issue Aug 25, 2020

oshadura commented Aug 27, 2020

@nsmith- I just tried it and can confirm that the current coffea version, '0.6.43', together with cloudpickle '1.5.0', works without problems.
