-
Notifications
You must be signed in to change notification settings - Fork 226
File transfer transformer #4235
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
File transfer transformer #4235
Conversation
Add a validate method to check if the given files exist on the the local node. Along with that enable the user to specify `files: ["*"]` in a runbook to upload all the files from the source directory. Signed-off-by: Piyush Sachdeva <[email protected]> Signed-off-by: Piyush Sachdeva <[email protected]>
Instead of checking the existence of the target directory on the destination directly from _internal_run method, do it in a new method: _check_dest_dir, and call this from _internal_run. Signed-off-by: Piyush Sachdeva <[email protected]> Signed-off-by: Piyush Sachdeva <[email protected]>
|
@microsoft-github-policy-service agree company="Microsoft" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces a new FileTransferTransformer that enables file copying between remote nodes using SCP, with automatic fallback to local download/upload if direct SCP fails. The PR also refactors the existing FileUploaderTransformer to share common validation and directory-checking logic.
Changes:
- Refactored FileUploaderTransformer to extract reusable methods (_check_dest_dir, _validate)
- Added FileTransferTransformer for remote-to-remote file transfers with SCP support
- Implemented copy_between_remotes method in RemoteCopy tool for direct node-to-node copying
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 17 comments.
| File | Description |
|---|---|
| lisa/transformers/file_uploader.py | Refactored existing FileUploaderTransformer and added new FileTransferTransformer class that extends it to support remote-to-remote file transfers with SCP fallback mechanism |
| lisa/tools/remote_copy.py | Added copy_between_remotes method to enable direct SCP-based file transfers between remote nodes, with SSH key management |
Test Suggestions (Required per LISA Guidelines):
Key Test Cases:
Since this PR introduces new transformer functionality for file transfers, the following integration tests should be run to validate the changes:
- Tests that verify basic file transfer operations between nodes
- Tests that validate SCP functionality and fallback mechanisms
- Tests that check transformer dependency resolution and execution order
Impacted LISA Features:
The changes primarily impact file transfer capabilities and transformer infrastructure, so no specific LISA feature classes are directly affected by this core functionality change.
Tested Azure Marketplace Images:
To validate the file transfer functionality across different operating systems and architectures, test with a minimal representative set:
canonical ubuntu-24_04-lts server latestredhat rhel 9_5 latestdebian debian-12 12 latest
lisa/transformers/file_uploader.py
Outdated
| def _validate(self): | ||
| runbook: FileTransferTransformerSchema = self.runbook | ||
| source = runbook.source | ||
| src_node = quick_connect(runbook.source_node, runbook.name) |
Copilot
AI
Jan 22, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The variable name "src_node" is created in _validate but never stored as an instance variable, meaning a new connection is created again in _internal_run. This is inefficient and could lead to issues. Consider storing the connected node as an instance variable to avoid reconnecting twice.
| src_node = quick_connect(runbook.source_node, runbook.name) | |
| # Cache the connected source node on the instance to avoid reconnecting. | |
| self._src_node = quick_connect(runbook.source_node, runbook.name) | |
| src_node = self._src_node |
lisa/transformers/file_uploader.py
Outdated
| f"remote-to-remote: '{file_name}' to \ | ||
| '{str(dest_purepath)}'" |
Copilot
AI
Jan 22, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The log message contains a line break escape sequence within the f-string that results in excessive whitespace. The string should be written on a single line or use proper multi-line string formatting to avoid the tabs in the logged output.
| f"remote-to-remote: '{file_name}' to \ | |
| '{str(dest_purepath)}'" | |
| f"remote-to-remote: '{file_name}' to '{str(dest_purepath)}'" |
lisa/tools/remote_copy.py
Outdated
| self._log.debug( | ||
| f"scp command attributes:\n" | ||
| f"destination user: {dest_user}\n" | ||
| f"destination IPadress: {dest_addr}\n" |
Copilot
AI
Jan 22, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The typo "IPadress" should be "IP address" in the debug log message.
| f"destination IPadress: {dest_addr}\n" | |
| f"destination IP address: {dest_addr}\n" |
lisa/transformers/file_uploader.py
Outdated
| default_factory=schema.RemoteNode, | ||
| metadata=field_metadata(required=True), | ||
| ) | ||
| # local path to store kernel file |
Copilot
AI
Jan 22, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment says "local path to store kernel file" but this transformer is for general file transfer, not kernel-specific transfers. The comment should describe a generic local path for temporary storage during transfers.
| # local path to store kernel file | |
| # optional local path for temporary storage during file transfers |
lisa/transformers/file_uploader.py
Outdated
| def _validate(self): | ||
| runbook: FileTransferTransformerSchema = self.runbook | ||
| source = runbook.source | ||
| src_node = quick_connect(runbook.source_node, runbook.name) | ||
| ls = src_node.tools[Ls] | ||
|
|
||
| if not ls.path_exists(source): | ||
| raise ValueError(f"source {source} doesn't exist on remote node.") | ||
|
|
||
| self._runbook_files: List[str] = runbook.files | ||
| if self._runbook_files == ["*"]: | ||
| self._runbook_files = [] | ||
| files = ls.list(source, sudo=True) | ||
| if len(files) == 0: | ||
| self._log.debug("No files to transfer.") | ||
| for file in files: | ||
| self._runbook_files.append(PurePath(file).name) | ||
|
|
||
| for file in self._runbook_files: | ||
| assert ls.path_exists( | ||
| f"{source}/{file}" | ||
| ), f"Node does not contain file: {file}" | ||
|
|
Copilot
AI
Jan 22, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The _validate method is called during _initialize without checking if self._node is available. However, the validate method for FileTransferTransformer accesses a remote node via quick_connect, which may fail or cause issues if called during initialization before the target node is properly set up. Consider whether validation should be deferred or if remote node checks should be moved to _internal_run.
lisa/tools/remote_copy.py
Outdated
| src_node, | ||
| src_path: PurePath, | ||
| dest_node, |
Copilot
AI
Jan 22, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The method parameters lack type hints. The parameters src_node, src_path, dest_node, dest_path, and recurse should have proper type annotations for better code clarity and type checking. Based on the TYPE_CHECKING import, src_node and dest_node should likely be typed as "Node".
| src_node, | |
| src_path: PurePath, | |
| dest_node, | |
| src_node: "Node", | |
| src_path: PurePath, | |
| dest_node: "Node", |
lisa/transformers/file_uploader.py
Outdated
| self._runbook_files.append(file) | ||
|
|
||
| for file in self._runbook_files: | ||
| assert os.path.exists(source / file), f"Node does not contain file: {file}" |
Copilot
AI
Jan 22, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error message in the assertion is misleading. The message says "Node does not contain file" but this is checking the local filesystem in the FileUploaderTransformer context, not a remote node. Consider changing to "Source path does not contain file" or "File does not exist in source path".
| assert os.path.exists(source / file), f"Node does not contain file: {file}" | |
| assert os.path.exists( | |
| source / file | |
| ), f"File does not exist in source path: {source / file}" |
lisa/transformers/file_uploader.py
Outdated
| ) | ||
| # local path to store kernel file | ||
| local_path: Optional[str] = "" | ||
| # flag to avoid wasting trying scp |
Copilot
AI
Jan 22, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment says "flag to avoid wasting trying scp" which contains grammatical errors. It should be "flag to avoid wasting time trying scp" or "flag to skip scp attempts".
| # flag to avoid wasting trying scp | |
| # flag to skip scp attempts |
| from dataclasses_json import dataclass_json | ||
|
|
||
| from lisa import schema | ||
| from lisa.node import Node, quick_connect |
Copilot
AI
Jan 22, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import of 'Node' is not used.
| from lisa.node import Node, quick_connect | |
| from lisa.node import quick_connect |
| from lisa.tools.mkdir import Mkdir | ||
| from lisa.tools.rm import Rm | ||
| from lisa.tools.whoami import Whoami | ||
| from lisa.util import constants |
Copilot
AI
Jan 22, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import of 'constants' is not used.
| from lisa.util import constants |
7889948 to
5083939
Compare
lisa/tools/remote_copy.py
Outdated
| dest_key = dest_connection[constants.ENVIRONMENTS_NODES_REMOTE_PRIVATE_KEY_FILE] | ||
| if not dest_user or not dest_addr or not dest_key: | ||
| self._log_error("destination node connection info inaccessible.") | ||
| return -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
throw exception here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes more sense, will make the change and push.
Thanks!
| ) | ||
|
|
||
| # Ensure to delete the ssh-key file for security | ||
| src_node.tools[Rm].remove_file(path=tmp_key_name, sudo=False) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably we should make sure the above execute_async has been executed successfully, then remove the key file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we not get rid of the file irrespectively, as there is no retry, and its better to delete the file for security?
| return -1 | ||
|
|
||
| # Use a temporary file for the key on src_node | ||
| tmp_key_name = "/tmp/.lisa_dest_key" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if there are multiple run against the same environment, the same key file name may be a problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file will be on the destination/target node.
The thought is that the node will get deleted after a successful run, even if not, the file will get deleted after each successful scp command.
`copy_between_remotes` method does scp of a file[s] (recursively) from source node to destination node. The scp command is executed on the source node. If the destination node is an azure vm, make sure to add the IP address of the source node in the list of `source_address_prefixes` of the destination node for the command to succeed. Signed-off-by: Piyush Sachdeva <[email protected]> Signed-off-by: Piyush Sachdeva <[email protected]>
The new file_transfer transformer uses `copy_between_remotes` method to scp file[s] from source to destination node. Its validate method verifies the existence of files on the source node and also has the functionality to transfer all files from source directory on source node using `files: ["*"]`. In case the `copy_between_remotes` fails, there is also a provision to first save files from source node to local, and then upload from the local node to destination. For this to work, the user should set the `local_path` variable. Signed-off-by: Piyush Sachdeva <[email protected]> Signed-off-by: Piyush Sachdeva <[email protected]>
5083939 to
b40cd66
Compare
This PR adds a new transformer: FileTransfer Transformer which is capable scp-ing files from one source environment vm to a target. Along with that there are minor changes to existing code for modularization and the implementation of core scp logic.