-
-
Notifications
You must be signed in to change notification settings - Fork 224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Location upload refactor #21092
Location upload refactor #21092
Conversation
Many of these test methods used the same setup, so I split these up into two classes - one with setup and one without. I also converted the one with setup to use the more standard LocationHierarchyPerTest class rather than its reimplementation of it.
Now there's a namedtuple to represent the raw data and a helper class for computing the upload. The latter no longer mutates.
This sets parent at the very last minute, which will let us simplify some of the other code considerably. It also allows for the removal of the fake_stub class, but I'm gonna commit that separately to avoid conflicts.
This also takes the non-db tests and makes them db tests - there was a bunch of mocking necessary to make them run, and it felt counter-productive. I couldn't reasonably split this into multiple commits with the tests still passing, sorry for the massive commit!
This was failing after the new ID validation
@@ -384,10 +296,11 @@ def validate_and_parse_stubs_from_excel(self): | |||
extra=", ".join(actual - expected), | |||
)) | |||
|
|||
type_stubs = self._get_types(type_sheet_reader) | |||
type_data = [self._get_type_data(index, row) | |||
for index, row in enumerate(type_sheet_reader)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E127 continuation line over-indented for visual indent
self.domain = domain | ||
self.domain_obj = Domain.get_by_name(domain) | ||
self.type_rows = type_rows | ||
self.location_rows = location_rows | ||
self.user = user |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used in a later commit
@@ -609,21 +573,284 @@ def test_data_format(self): | |||
) | |||
assert_errors(result, ['index 0 should be valid decimal numbers']) | |||
|
|||
def test_custom_data(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test and those below it are just being moved from that class into this one
location = old_locations_by_pk[pk] | ||
yield location.location_id | ||
pk = location.parent_id | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made a comment on the commit, but it didn't show up in diff view, so I'll copy here:
This change is twofold:
It's difficult to assign a parent location at the time when the location stub is being created. You'd need to either reorder locations to be top-to-bottom first (which is hard, since they're not yet fully parsed), or do two passes, one to init everything, then a second to add parents. I didn't like the second option since you'd have to mutate the location after its construction. Luckily, it looked like very little actually requires the parent object - pretty much just at the point of saving, so I opted to move the lookup there and drop the linkage.
The second aspect of this is that I realized we should be looking for the old ancestors here anyways, since they're the ones that are affected by a deletion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks Ethan... Can you add couple of comments within save_locations
to understand whats happening here better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great! Left a few comments.
# check if any attributes are being updated | ||
for attr in self.meta_data_attrs: | ||
if getattr(old_version, attr, None) != getattr(self, attr, None): | ||
if getattr(self.old_object, attr, None) != getattr(self.new_data, attr, None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like both old and new should have the attribute, and if not it's a bug. Would it be safe to remove the default (None
) from these getattr
calls?
def UpdateLocRow(self, name, site_code, location_type, parent_code, | ||
location_id=Ellipsis, **kwargs): | ||
"""Like NewLocRow, but looks up location_id from the site_code""" | ||
if location_id != Ellipsis: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you should never pass location_id
to this function, would it make sense to remove the location_id
argument from UpdateLocRow
and change this conditional to if "location_id" in kwargs:
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I did it this way was because NewLocRow
and UpdateRow
take the same args, and the test cases pass in name, site_code, location_type, parent_code, location_id
as positional arguments. However, I think I've now removed the location_id
from all of the args, or at least it doesn't deserve positional status anymore. I'll look into it.
for attr in self.meta_data_attrs: | ||
setattr(self.db_object, attr, getattr(self, attr, None)) | ||
setattr(obj, attr, getattr(self.new_data, attr, None)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can the default None
be removed from getattr
? Seems like it could mask a bug (mistyped attribute name in self.meta_data_attrs
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, that's a great idea, will update.
.format(loc_stub.site_code)) | ||
else: | ||
parent_exists = (loc_stub.parent_code in self.old_collection.locations_by_site_code) | ||
if (parent_exists and loc_stub.parent_code not in accessible_site_codes): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will this allow a location-restricted user to create a new root location?
nit: unnecessary parens on above two lines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good thinking, I should probably add a test and branch specifically for that
if not loc_stub.needs_save: | ||
# Allow users to include any loc, as long as there are no changes | ||
continue | ||
if not loc_stub.is_new: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think of eliminating the negative (not
) and swapping the branches of this conditional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
corehq/apps/locations/views.py
Outdated
@@ -846,7 +846,10 @@ def post(self, request, *args, **kwargs): | |||
# We need to start this task after this current request finishes because this | |||
# request uses the lock_locations decorator which acquires the same lock that | |||
# the task will try to acquire. | |||
task = import_locations_async.apply_async(args=[domain, file_ref.download_id], countdown=10) | |||
task = import_locations_async.apply_async( | |||
args=[domain, file_ref.download_id, request.couch_user], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to just pass the User ID and look it up in Celery rather than pass the entire user object.
Got part way through this. Will try review the rest later |
Document saving a little more
This was causing an intermittent test failure. A regrettable side effect of doing this was that I had to drop the import statements from custom_data_fields.__init__. This appears to be because of this change from Django 1.9: > All models need to be defined inside an installed application or declare an explicit app_label. Furthermore, it isn’t possible to import them before their application is loaded. In particular, it isn’t possible to import models inside the root package of an application.
The whole string module is deprecated - you're supposed to use methods now :(
PR feedback from #21092
corehq/apps/locations/views.py
Outdated
# We need to start this task after this current request finishes because this | ||
# request uses the lock_locations decorator which acquires the same lock that | ||
# the task will try to acquire. | ||
task = import_locations_async.apply_async( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
F841 local variable 'task' is assigned to but never used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was a bad merge - now resolved
95d09f5
to
a0e4366
Compare
excel will convert these columns to numeric types if they happen to be composed of numerals
|
||
|
||
@attrs(frozen=True) | ||
class LocationData(object): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice change!
Not sure if this text failure is related to changes in this PR or not?
Guessing this one is intermittent (kicked it):
|
cac17a8
to
9e42d49
Compare
Just implemented a proper fix for the caching issue that should fix the build: 9e42d49 |
|
||
|
||
from __future__ import unicode_literals | ||
@quickcache(['domain', 'field_type'], timeout=60*60) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
E226 missing whitespace around arithmetic operator
|
||
from corehq.apps.custom_data_fields.dbaccessors import get_by_domain_and_type |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
F401 'corehq.apps.custom_data_fields.dbaccessors.get_by_domain_and_type' imported but unused
Product-facing changes
location_id
is now a required field in order to update a location. Previously, users had issues where they'd intend to add a new location, but the site_code they specified was already taken. The old behavior was to update the existing location that had that site_code, whereas now it'll fail. This ID requirement is consistent with user and case uploads.Notes for reviewers
This is a big PR, I'd recommend going commit-by-commit. There's one big commits, but most should be pretty digestible. Let me know if I can clarify anything. I'm treating this PR as a long-running feature branch, and PRing additions into this branch, not pushing directly to it anymore. The non user-facing changes in here are:
LocationTypeStub
andLocationStub
classes are now initialized a little later in the process, and require a data object and the old collection as arguments to the constructor. This allows for all methods and attributes to be effectively static. Previously, they were initialized as part of the parsing process, and later injected with the old collection, which made for a state model that I found difficult to reason with.LocationHierarchyPerTest
rather than the one-off setup methods (I don't think that had existed at the time this was written).NewLocRow
, which doesn't supply an ID, orself.UpdateLocRow
, which looks up the ID by the site code.@mkangia @millerdev @sravfeyn
QA request: https://manage.dimagi.com/default.asp?279327
Oh also, I'm not crazy about the term "stub", let me know if you've got a better idea for what to call those things.