-
Notifications
You must be signed in to change notification settings - Fork 316
feat: adds condition class and assoc. unit tests #2159
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: adds condition class and assoc. unit tests #2159
Conversation
"""str: The expression string for the condition.""" | ||
|
||
# Cast assumes expression is always set due to __init__ validation | ||
return typing.cast(str, self._properties.get("expression")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just for my education, why is typing.cast()
necessary for expression, but not title or description?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both mypy
and pytype
struggle with correctly assessing return types. In this case, both checkers think the return type should be Optional[Any]
when we expect (and validate for) a str
.
Despite the fact that we indicate via typehinting:
- that
expression
is astr
in the__init__()
method - the
expression
getter returns astr
- the
expression
setter method signature requires that value be input as astr
- the setter method has internal checks to ensure that the value stored in the
_properties
dict is astr
neither mypy
nor pytype
will believe it.
It is not inherently clear why they believe our typehinting for title
and description
but not expression
. I have come across this failure multiple times in adding objects to our repos.
There are approximately seven similar examples of this problem elsewhere in this file (that predate my taking over the repo) (and there are examples in other files in this codebase).
These are the error messages we see:
mypy
google/cloud/bigquery/dataset.py:1140: error: Incompatible return value type (got "Optional[Any]", expected "str") [return-value]
pytype
/repo/python-bigquery/google/cloud/bigquery/dataset.py:1140:1: error: in expression: bad option 'None' in return type [bad-return-type]
Expected: str
Actually returned: Optional[Any]
condition = Condition(expression=self.EXPRESSION) | ||
expected_api_repr = { | ||
"expression": self.EXPRESSION, | ||
"title": None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not very familiar with Condition, do we need to distinguish between title and description being not set versus being an empty string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added a pair of asserts to test_setters
to confirm the use case that assigning empty strings is allowed and succeeds.
I think there are two issues here:
-
this test is ensuring that if I try to create a
Condition
object using ONLY the requiredexpression
argument and leave the other arguments (title
,description
) blank, will it create the expected outcome: i.e. a create an object withNone
values assigned to bothtitle
anddescription
? -
Do we need to test how the setter for
title
anddescription
handle a range of values (None
,empty string
, andnon-empty string
, something besides the above, etc)
- In
test_setters
we confirm that we can reassign a new string to eithertitle
ordescription
AND we confirm that we can assign a new value ofNone
(WAI) - In
test_constructor_and_getters_full
we confirm that we can set anon-empty string
(WAI) - In
test_validation_setters
we confirm that something besides a string will fail (WAI)
Our logic does not hinge on whether a value assigned to title
or description
is Falsey OR not. We do not do a boolean check NOR do empty strings trigger some logic so there is no difference between assigning an empty string
vs a non-empty string
. I am not convinced that we need the test for empty strings, but I don't believe it will hurt us.
|
||
def to_api_repr(self) -> Dict[str, Any]: | ||
"""Construct the API resource representation of this Condition.""" | ||
return self._properties |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to make a deep copy of the dict, similar to other classes, such as Table.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless we have a compelling reason to use deepcopy, I am disinclined. Thoughts?
In a previous experimental PR, Tim left a comment about some of the pitfalls of using deep copy, if not necessary.
It might be a bit wasteful to make deepcopy here and in `from_api_repr`. Indeed it's safer, but could add a lot of overhead. IIRC we actually removed some `deepcopy` calls from `SchemaField` because it was slowing down customers who build dynamic schemas in their code.
In all cases we are returning a dict (thus there should not be a significant risk of some underlying nested value being changed by another expression in the code)
I was curious about the overall cost of a deepcopy so I did a fairly simple experiment:
In [1]: import copy
In [2]: api_repr = {
...: "expression": "some_value",
...: "title": "some_value",
...: "unexpected_field": "some_value",
...: }
In [3]: _properties['condition'] = api_repr
From fastest to slowest:
Simply assign an alias to the dict
In [4]: %timeit new_prop = _properties
14.4 ns ± 0.0149 ns per loop (mean ± std. dev. of 7 runs, 100,000,000 loops each)
# NOTE: when we use `return _properties`, it is the same as using an alias.
Use the builtin copy method in the dict class
This creates a shallow copy of the dict
In [11]: %timeit new_prop = _properties.copy()
58.6 ns ± 0.228 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
Create a copy.copy
This also creates a shallow copy of the dict
In [9]: %timeit new_prop = copy.copy(_properties)
145 ns ± 0.401 ns per loop (mean ± std. dev. of 7 runs, 10,000,000 loops each)
Create a copy.deepcopy
In [5]: %timeit new_prop = copy.deepcopy(_properties)
3 µs ± 16.7 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
Adds a Condition class and associated suite of tests.
🦕