Conversation
f39226d to 95132cb
66e1fa4 to cfe3567
How come the CIs are failing? Is there an unmet dependency somewhere?

Bad merge. Hopefully fixed now.
154bf81 to d89520a
sumpy/expansion/local.py
Outdated
            f"A direct loopy kernel for translation from "
            f"{src_expansion} to {self} is not implemented.")

    def get_loopy_evaluator(self, kernels: Sequence[Kernel]) -> lp.TranslationUnit:
This should be named consistently with the above. How about loopy_evaluate?
Currently we have a few names that are inconsistent with each other:
- get_loopy_expansion_formation (for p2e)
- get_loopy_evaluator (e2p)
- loopy_translate_from (m2l)
- preprocess_multipole_loopy_knl
- postprocess_local_loopy_knl
Any suggestions on which to use?
I think I would prefer loopy_purpose_of_thing. No get. But the name should indicate whether optimizations are also being returned.
Btw, I would not be opposed to cleaning up the inconsistencies in this PR.
loopy_evaluate_with_optimizations?
That doesn't work: it reads as "evaluate with optimizations", which is not what this is. Maybe the pattern loopy_noun_and_optimizations, i.e. loopy_evaluator_and_optimizations, would be better?
    # DifferentiatedExprDerivativeTaker and sympy expressions, so we need to
    # make the taker a DifferentiatedExprDerivativeTaker instance.
    base_taker = DifferentiatedExprDerivativeTaker(base_taker,
            {tuple([0]*self.dim): 1})
Could you explain the background to this change?
This is not true anymore. AxisTargetDerivative.postprocess_at_target and DirectionalTargetDerivative.postprocess_at_target now handle ExprDerivativeTaker.
sumpy/e2p.py
Outdated
    @memoize_method
    def get_cached_loopy_knl_and_optimizations(self):
        return self.expansion.get_loopy_evaluator(self.kernels)
It seems that this would simply fail if E2P were used for M2P, right?
No, there's an implementation for M2P as well. (Just not super optimized)
sumpy/e2p.py
Outdated
        pass

    @memoize_method
    def get_cached_loopy_knl_and_optimizations(self):
I don't like the naming of this; there's already get_kernel. The name should reflect that we're talking about the evaluator. Also, whether or not something is memoized isn't typically reflected in its name.
    def loopy_evaluator(self, kernels: Sequence[Kernel]) -> lp.TranslationUnit:
    def loopy_evaluator_and_optimizations(self, kernels: Sequence[Kernel]) \
            -> Tuple[lp.TranslationUnit, Sequence[
                Callable[[lp.TranslationUnit], lp.TranslationUnit]]]:
- One optimization callable might suffice. loopy_evaluator_and_transform?
- Consistency with get_inner_knl_and_optimizations?
(Don't feel strongly about any of this.)
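One way to read the "one callable might suffice" suggestion is sketched below, assuming each optimization is a plain function from a kernel to a kernel. The names `compose_transforms` and `add_tag`, and the list of strings standing in for `lp.TranslationUnit`, are hypothetical and only for illustration; none of this is sumpy's or loopy's API.

```python
from functools import reduce
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def compose_transforms(
        transforms: Sequence[Callable[[T], T]]) -> Callable[[T], T]:
    """Fold a sequence of kernel-to-kernel transforms into one callable."""
    def apply_all(knl: T) -> T:
        # Apply the transforms left to right, feeding each result forward.
        return reduce(lambda k, transform: transform(k), transforms, knl)
    return apply_all


# Stand-in "kernel": a list of tags; each "optimization" appends one tag.
def add_tag(tag: str) -> Callable[[list], list]:
    return lambda knl: knl + [tag]


optimize = compose_transforms([add_tag("split"), add_tag("tag_inames")])
result = optimize(["base"])
```

With a helper like this, the method could return a single transform while the individual steps stay separately defined.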
            make_e2p_loopy_kernel)
    try:
        return make_l2p_loopy_kernel_for_volume_taylor(self, kernels)
    except NotImplementedError:
Make a subclass of NotImplementedError so that this except clause catches only this specific case.
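A minimal sketch of that idea follows. The exception name `NoDirectKernelError` and the three helper functions are hypothetical stand-ins, not names from sumpy; the point is only that the fallback is triggered by the dedicated subclass rather than by any `NotImplementedError`.

```python
class NoDirectKernelError(NotImplementedError):
    """Hypothetical: raised when no specialized kernel is available."""


def make_specialized_kernel(supported: bool) -> str:
    if not supported:
        raise NoDirectKernelError("no specialized kernel for this expansion")
    return "specialized"


def make_generic_kernel() -> str:
    return "generic"


def make_kernel(supported: bool) -> str:
    try:
        return make_specialized_kernel(supported)
    except NoDirectKernelError:
        # Only the expected "not available" case triggers the fallback;
        # any other NotImplementedError still propagates to the caller.
        return make_generic_kernel()
```

Because the subclass still derives from `NotImplementedError`, existing callers that catch the base class keep working.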
    slowest_axis = axis_permutation[0]
    c = max_mi[slowest_axis]
    v = [pymbolic.var(f"x{i}") for i in range(dim)]
- I think it'd be better if these were named i<something>.
- Name the "vector" variable similarly to the actual iname names, to minimize confusion.
            for deriv_id in deriv_id_to_coeff])

    def get_domains(v, iorder, with_sync):
        domains = [f"{{ [{x0}_outer]: 0<={x0}_outer<={order//c} }}"]

        domains = [f"{{ [{x0}_outer]: 0<={x0}_outer<={order//c} }}"]
        """
        :param with_sync: Whether to expose a loop nesting level that
            is finer-grained than order, for synchronization purposes.
        """
        domains = [f"{{ [{x0}_outer]: 0<={x0}_outer<={order//c} }}"]
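For readers less used to doubled braces in f-strings, this small sketch shows what the domain string expands to. The function name `outer_domain` and the sample values are assumptions chosen for illustration; only the format string itself is taken from the diff.

```python
def outer_domain(iname: str, order: int, c: int) -> str:
    # "{{" and "}}" are literal braces; single braces interpolate values.
    return f"{{ [{iname}_outer]: 0<={iname}_outer<={order//c} }}"


domain = outer_domain("x0", order=5, c=2)
# domain == "{ [x0_outer]: 0<=x0_outer<=2 }"
```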
    # the previous c rows set coeffs_copy[p-1, :]
    # and then read from coeffs_copy[p, :].

    result = 0
    for mi, coeff in expr_dict.items():
        result += coeff * self._diff(expr, bvec, mi)
    return result
return sum(...) (also for the source version)
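The suggested form could look like the sketch below. Here `diff` is a hypothetical stand-in for `self._diff` that works on plain floats, so the snippet runs on its own; the real method takes additional arguments.

```python
def diff(expr: float, mi: tuple) -> float:
    # Stand-in for self._diff: scales expr by the multi-index order.
    return expr * sum(mi)


def apply_coeff_dict(expr: float, expr_dict: dict) -> float:
    # Same accumulation as the explicit loop, written as one expression.
    return sum(coeff * diff(expr, mi) for mi, coeff in expr_dict.items())


value = apply_coeff_dict(2.0, {(1, 0): 3.0, (0, 2): 0.5})
```

Since `sum` starts from 0, this matches the `result = 0` initialization of the loop, including for an empty dictionary.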
        """
        return expr_dict

    def get_derivative_coeff_dict_at_target(self, expr_dict):
- Describe better what the multi-indices on the input and output sides are.
- Generate the source/target docstrings from one source via format.
- Name: apply_target_transformation_to_derivative_coeff_dict
- Maybe make an explicit type that wraps the expr_dict, as you suggested.
    r"""Get the derivative transformation of the expression at target
    represented by the dictionary expr_dict which is mapping from multi-index
    `mi` to coefficient `coeff`.
    Expression represented by the dictionary `expr_dict` is

    Expression represented by the dictionary `expr_dict` is
    The expression represented by the dictionary `expr_dict` is
    knl = lp.tag_inames(knl, {"itgt_box": "g.0"})
    def get_optimized_kernel(self, max_ntargets_in_one_box):
        _, optimizations = self.get_loopy_evaluator_and_optimizations()
        knl = self.get_kernel(max_ntargets_in_one_box=max_ntargets_in_one_box)
Add a comment to explain the idea?
Unsubscribing... @-mention or request review once it's ready for a look or needs attention.
Depends on #153
Depends on inducer/loopy#742
For compressed Taylor series L2P, we optimize the calculation by computing the uncompressed coefficients from the compressed ones in parallel, using the work items in a group. Say the coefficients are compressed along the z axis and only z=0, 1 are stored. We first calculate the parts of the Taylor series for z=0, 1. Then we calculate
- z=2, 3 from z=0, 1
- z=4, 5 from z=2, 3

For biharmonic 2D, it's slightly different. We first use z=0, 1, 2, 3 to calculate the parts of the Taylor series, then:
- z=4, 5 from z=0, 1, 2, 3
- z=4, 5 into z=0, 1 storage
- z=6, 7 from z=2, 3, 4, 5
- z=6, 7 into z=2, 3 storage
- z=8, 9 from z=4, 5, 6, 7
- z=8, 9 into z=4, 5 storage
- z=10, 11 from z=6, 7, 8, 9
- z=10, 11 into z=6, 7 storage
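The biharmonic storage-reuse pattern described above can be enumerated with a short sketch. It only lists which z values are computed from which and where they are stored; it does not compute any coefficients. The function name `rolling_schedule` and its parameters are hypothetical, and it assumes that "into z=a, b storage" means the slots that last held z=a, b.

```python
from typing import List, Tuple

Step = Tuple[List[int], List[int], List[int]]


def rolling_schedule(n_stored: int, n_new: int, z_max: int) -> List[Step]:
    """Enumerate (new, sources, storage) index triples for z <= z_max."""
    steps = []
    z = n_stored  # z = 0 .. n_stored-1 are the compressed (stored) values
    while z <= z_max:
        new = list(range(z, min(z + n_new, z_max + 1)))
        sources = list(range(z - n_stored, z))
        # Once the new values are computed, the oldest ones are not read
        # again in this scheme, so their storage can be reused.
        storage = [zi - n_stored for zi in new]
        steps.append((new, sources, storage))
        z += n_new
    return steps


for new, sources, storage in rolling_schedule(n_stored=4, n_new=2, z_max=11):
    print(f"z={new} from z={sources} into z={storage} storage")
```

Calling it with `n_stored=4, n_new=2, z_max=11` reproduces the four from/into pairs given for the biharmonic case.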