Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Location upload refactor #21092

Merged
merged 50 commits into from
Jul 25, 2018
Merged

Location upload refactor #21092

merged 50 commits into from
Jul 25, 2018

Conversation

esoergel
Copy link
Contributor

@esoergel esoergel commented Jul 2, 2018

Product-facing changes

  • location_id is now a required field in order to update a location. Previously, users had issues where they'd intend to add a new location, but the site_code they specified was already taken. The old behavior was to update the existing location that had that site_code, whereas now it'll fail. This ID requirement is consistent with user and case uploads.
  • Location restricted users can now access the location uploader. Restricted users can include any location they like in the upload, but they may only modify accessible ones.
  • Restricted users may not modify location types.
  • Restricted and non-restricted users have the option to download an excel file with just headers in it, to use as a template for adding new locations. Restricted users currently see only this file, so they basically can't edit existing locations in bulk. The use case is primarily for bulk addition of new locations.

Notes for reviewers

This is a big PR, I'd recommend going commit-by-commit. There's one big commits, but most should be pretty digestible. Let me know if I can clarify anything. I'm treating this PR as a long-running feature branch, and PRing additions into this branch, not pushing directly to it anymore. The non user-facing changes in here are:

  • Use a read-only data class (like a namedtuple) to store cleaned user-provided data that was previously contained in attributes on the Stub classes.
  • The LocationTypeStub and LocationStub classes are now initialized a little later in the process, and require a data object and the old collection as arguments to the constructor. This allows for all methods and attributes to be effectively static. Previously, they were initialized as part of the parsing process, and later injected with the old collection, which made for a state model that I found difficult to reason with.
  • There was a test class that didn't depend on a db, but this required a substantial bit of mocking to support. I made this a db test class and dropped the mocks.
  • For another test class, I standardized the setup a bit by switching to LocationHierarchyPerTest rather than the one-off setup methods (I don't think that had existed at the time this was written).
  • To support the change to require location_ids for updates, I put all test upload rows in either NewLocRow, which doesn't supply an ID, or self.UpdateLocRow, which looks up the ID by the site code.
  • The location upload classes now require a user, which is used to make this feature location-safe. I haven't yet marked it as such, since I want to do some UI work first.

@mkangia @millerdev @sravfeyn

QA request: https://manage.dimagi.com/default.asp?279327

Oh also, I'm not crazy about the term "stub", let me know if you've got a better idea for what to call those things.

esoergel added 19 commits July 2, 2018 13:57
Many of these test methods used the same setup, so I split these up into
two classes - one with setup and one without.  I also converted the one
with setup to use the more standard LocationHierarchyPerTest class
rather than its reimplementation of it.
Now there's a namedtuple to represent the raw data and a helper class
for computing the upload.  The latter no longer mutates.
This sets parent at the very last minute, which will let us simplify
some of the other code considerably.  It also allows for the removal of
the fake_stub class, but I'm gonna commit that separately to avoid
conflicts.
This also takes the non-db tests and makes them db tests - there was a bunch of
mocking necessary to make them run, and it felt counter-productive.
I couldn't reasonably split this into multiple commits with the tests still
passing, sorry for the massive commit!
This was failing after the new ID validation
@esoergel esoergel added Open for review: do not merge A work in progress product/all-users-all-environments Change impacts all users on all environments labels Jul 2, 2018
@esoergel esoergel requested review from millerdev and mkangia July 2, 2018 18:22
@@ -384,10 +296,11 @@ def validate_and_parse_stubs_from_excel(self):
extra=", ".join(actual - expected),
))

type_stubs = self._get_types(type_sheet_reader)
type_data = [self._get_type_data(index, row)
for index, row in enumerate(type_sheet_reader)]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E127 continuation line over-indented for visual indent

self.domain = domain
self.domain_obj = Domain.get_by_name(domain)
self.type_rows = type_rows
self.location_rows = location_rows
self.user = user
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is used in a later commit

@@ -609,21 +573,284 @@ def test_data_format(self):
)
assert_errors(result, ['index 0 should be valid decimal numbers'])

def test_custom_data(self):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test and those below it are just being moved from that class into this one

location = old_locations_by_pk[pk]
yield location.location_id
pk = location.parent_id

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made a comment on the commit, but it didn't show up in diff view, so I'll copy here:

This change is twofold:

It's difficult to assign a parent location at the time when the location stub is being created. You'd need to either reorder locations to be top-to-bottom first (which is hard, since they're not yet fully parsed), or do two passes, one to init everything, then a second to add parents. I didn't like the second option since you'd have to mutate the location after its construction. Luckily, it looked like very little actually requires the parent object - pretty much just at the point of saving, so I opted to move the lookup there and drop the linkage.

The second aspect of this is that I realized we should be looking for the old ancestors here anyways, since they're the ones that are affected by a deletion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Ethan... Can you add couple of comments within save_locations to understand whats happening here better.

Copy link
Contributor

@millerdev millerdev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great! Left a few comments.

# check if any attributes are being updated
for attr in self.meta_data_attrs:
if getattr(old_version, attr, None) != getattr(self, attr, None):
if getattr(self.old_object, attr, None) != getattr(self.new_data, attr, None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like both old and new should have the attribute, and if not it's a bug. Would it be safe to remove the default (None) from these getattr calls?

def UpdateLocRow(self, name, site_code, location_type, parent_code,
location_id=Ellipsis, **kwargs):
"""Like NewLocRow, but looks up location_id from the site_code"""
if location_id != Ellipsis:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you should never pass location_id to this function, would it make sense to remove the location_id argument from UpdateLocRow and change this conditional to if "location_id" in kwargs:?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I did it this way was because NewLocRow and UpdateRow take the same args, and the test cases pass in name, site_code, location_type, parent_code, location_id as positional arguments. However, I think I've now removed the location_id from all of the args, or at least it doesn't deserve positional status anymore. I'll look into it.

for attr in self.meta_data_attrs:
setattr(self.db_object, attr, getattr(self, attr, None))
setattr(obj, attr, getattr(self.new_data, attr, None))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the default None be removed from getattr? Seems like it could mask a bug (mistyped attribute name in self.meta_data_attrs).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, that's a great idea, will update.

.format(loc_stub.site_code))
else:
parent_exists = (loc_stub.parent_code in self.old_collection.locations_by_site_code)
if (parent_exists and loc_stub.parent_code not in accessible_site_codes):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will this allow a location-restricted user to create a new root location?

nit: unnecessary parens on above two lines

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good thinking, I should probably add a test and branch specifically for that

if not loc_stub.needs_save:
# Allow users to include any loc, as long as there are no changes
continue
if not loc_stub.is_new:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think of eliminating the negative (not) and swapping the branches of this conditional?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -846,7 +846,10 @@ def post(self, request, *args, **kwargs):
# We need to start this task after this current request finishes because this
# request uses the lock_locations decorator which acquires the same lock that
# the task will try to acquire.
task = import_locations_async.apply_async(args=[domain, file_ref.download_id], countdown=10)
task = import_locations_async.apply_async(
args=[domain, file_ref.download_id, request.couch_user],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to just pass the User ID and look it up in Celery rather than pass the entire user object.

@snopoke
Copy link
Contributor

snopoke commented Jul 10, 2018

Got part way through this. Will try review the rest later

esoergel and others added 5 commits July 10, 2018 10:09
This was causing an intermittent test failure.

A regrettable side effect of doing this was that I had to drop the import
statements from custom_data_fields.__init__.  This appears to be because of this
change from Django 1.9:

> All models need to be defined inside an installed application or declare an
explicit app_label. Furthermore, it isn’t possible to import them before their
application is loaded. In particular, it isn’t possible to import models inside
the root package of an application.
esoergel and others added 2 commits July 11, 2018 11:32
The whole string module is deprecated - you're supposed to use methods now :(
# We need to start this task after this current request finishes because this
# request uses the lock_locations decorator which acquires the same lock that
# the task will try to acquire.
task = import_locations_async.apply_async(

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F841 local variable 'task' is assigned to but never used

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was a bad merge - now resolved

@esoergel esoergel force-pushed the es/location-upload branch from 95d09f5 to a0e4366 Compare July 12, 2018 16:04
excel will convert these columns to numeric types if they happen to be composed
of numerals
@esoergel esoergel added QA Passed and removed Open for review: do not merge A work in progress awaiting QA QA in progress. Do not merge labels Jul 23, 2018
@esoergel
Copy link
Contributor Author

esoergel commented Jul 23, 2018

Just added these two changes (forgot to PR them in separately):
19d08cb
a0d156b
This should be ready to merge.



@attrs(frozen=True)
class LocationData(object):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice change!

@millerdev
Copy link
Contributor

Not sure if this text failure is related to changes in this PR or not?

======================================================================
ERROR: corehq.apps.locations.tests.test_location_fixtures:LocationFixturesTest.test_all_locations_flag_returns_all_locations
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/vendor/lib/python2.7/site-packages/mock/mock.py", line 1305, in patched
    return func(*args, **keywargs)
  File "/mnt/commcare-hq-ro/corehq/apps/locations/tests/test_location_fixtures.py", line 171, in test_all_locations_flag_returns_all_locations
    'Somerville', 'New York', 'New York City', 'Manhattan', 'Queens', 'Brooklyn']
  File "/vendor/lib/python2.7/site-packages/mock/mock.py", line 1305, in patched
    return func(*args, **keywargs)
  File "/mnt/commcare-hq-ro/corehq/apps/locations/tests/test_location_fixtures.py", line 86, in _assert_fixture_matches_file
    fixture = ElementTree.tostring(call_fixture_generator(generator, self.user)[-1])
  File "/usr/local/lib/python2.7/xml/etree/ElementTree.py", line 1126, in tostring
    ElementTree(element).write(file, encoding, method=method)
  File "/usr/local/lib/python2.7/xml/etree/ElementTree.py", line 817, in write
    self._root, encoding, default_namespace
  File "/usr/local/lib/python2.7/xml/etree/ElementTree.py", line 884, in _namespaces
    add_qname(tag)
  File "/usr/local/lib/python2.7/xml/etree/ElementTree.py", line 868, in add_qname
    qnames[qname] = encode(qname)
  File "/usr/local/lib/python2.7/xml/etree/ElementTree.py", line 843, in encode
    return text.encode(encoding)
UnicodeEncodeError: 'ascii' codec can't encode characters in position 0-1: ordinal not in range(128)
-------------------- >> begin captured logging << --------------------
corehq.apps.domain.deletion: INFO: Deleting cases...
corehq.apps.domain.deletion: INFO: Deleting cases complete.
corehq.apps.domain.deletion: INFO: Deleting forms...
corehq.apps.domain.deletion: INFO: Deleting forms complete.
--------------------- >> end captured logging << ---------------------

Guessing this one is intermittent (kicked it):

======================================================================
FAIL: corehq.apps.api.tests:TestCommCareCaseResource.test_get_list
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/vendor/lib/python2.7/site-packages/mock/mock.py", line 1305, in patched
    return func(*args, **keywargs)
  File "/mnt/commcare-hq-ro/corehq/apps/api/tests.py", line 432, in test_get_list
    self.assertEqual(response.status_code, 200)
AssertionError: 403 != 200

@esoergel esoergel force-pushed the es/location-upload branch from cac17a8 to 9e42d49 Compare July 24, 2018 19:44
@esoergel
Copy link
Contributor Author

Just implemented a proper fix for the caching issue that should fix the build: 9e42d49



from __future__ import unicode_literals
@quickcache(['domain', 'field_type'], timeout=60*60)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

E226 missing whitespace around arithmetic operator


from corehq.apps.custom_data_fields.dbaccessors import get_by_domain_and_type

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

F401 'corehq.apps.custom_data_fields.dbaccessors.get_by_domain_and_type' imported but unused

@esoergel esoergel merged commit 76b0f68 into master Jul 25, 2018
@esoergel esoergel deleted the es/location-upload branch July 25, 2018 15:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
product/all-users-all-environments Change impacts all users on all environments QA Passed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants