Skip to content

Conversation

@psachdeva-ms
Copy link

This PR adds a new transformer: FileTransfer Transformer which is capable scp-ing files from one source environment vm to a target. Along with that there are minor changes to existing code for modularization and the implementation of core scp logic.

Add a validate method to check if the given files exist on the the local node.
Along with that enable the user to specify `files: ["*"]` in a runbook to upload
all the files from the source directory.

Signed-off-by: Piyush Sachdeva <[email protected]>
Signed-off-by: Piyush Sachdeva <[email protected]>
Instead of checking the existence of the target directory on the destination
directly from _internal_run method, do it in a new method: _check_dest_dir,
and call this from _internal_run.

Signed-off-by: Piyush Sachdeva <[email protected]>
Signed-off-by: Piyush Sachdeva <[email protected]>
@psachdeva-ms
Copy link
Author

@microsoft-github-policy-service agree company="Microsoft"

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR introduces a new FileTransferTransformer that enables file copying between remote nodes using SCP, with automatic fallback to local download/upload if direct SCP fails. The PR also refactors the existing FileUploaderTransformer to share common validation and directory-checking logic.

Changes:

  • Refactored FileUploaderTransformer to extract reusable methods (_check_dest_dir, _validate)
  • Added FileTransferTransformer for remote-to-remote file transfers with SCP support
  • Implemented copy_between_remotes method in RemoteCopy tool for direct node-to-node copying

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 17 comments.

File Description
lisa/transformers/file_uploader.py Refactored existing FileUploaderTransformer and added new FileTransferTransformer class that extends it to support remote-to-remote file transfers with SCP fallback mechanism
lisa/tools/remote_copy.py Added copy_between_remotes method to enable direct SCP-based file transfers between remote nodes, with SSH key management

Test Suggestions (Required per LISA Guidelines):

Key Test Cases:
Since this PR introduces new transformer functionality for file transfers, the following integration tests should be run to validate the changes:

  • Tests that verify basic file transfer operations between nodes
  • Tests that validate SCP functionality and fallback mechanisms
  • Tests that check transformer dependency resolution and execution order

Impacted LISA Features:
The changes primarily impact file transfer capabilities and transformer infrastructure, so no specific LISA feature classes are directly affected by this core functionality change.

Tested Azure Marketplace Images:
To validate the file transfer functionality across different operating systems and architectures, test with a minimal representative set:

  • canonical ubuntu-24_04-lts server latest
  • redhat rhel 9_5 latest
  • debian debian-12 12 latest

def _validate(self):
runbook: FileTransferTransformerSchema = self.runbook
source = runbook.source
src_node = quick_connect(runbook.source_node, runbook.name)
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable name "src_node" is created in _validate but never stored as an instance variable, meaning a new connection is created again in _internal_run. This is inefficient and could lead to issues. Consider storing the connected node as an instance variable to avoid reconnecting twice.

Suggested change
src_node = quick_connect(runbook.source_node, runbook.name)
# Cache the connected source node on the instance to avoid reconnecting.
self._src_node = quick_connect(runbook.source_node, runbook.name)
src_node = self._src_node

Copilot uses AI. Check for mistakes.
Comment on lines 218 to 219
f"remote-to-remote: '{file_name}' to \
'{str(dest_purepath)}'"
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The log message contains a line break escape sequence within the f-string that results in excessive whitespace. The string should be written on a single line or use proper multi-line string formatting to avoid the tabs in the logged output.

Suggested change
f"remote-to-remote: '{file_name}' to \
'{str(dest_purepath)}'"
f"remote-to-remote: '{file_name}' to '{str(dest_purepath)}'"

Copilot uses AI. Check for mistakes.
self._log.debug(
f"scp command attributes:\n"
f"destination user: {dest_user}\n"
f"destination IPadress: {dest_addr}\n"
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The typo "IPadress" should be "IP address" in the debug log message.

Suggested change
f"destination IPadress: {dest_addr}\n"
f"destination IP address: {dest_addr}\n"

Copilot uses AI. Check for mistakes.
default_factory=schema.RemoteNode,
metadata=field_metadata(required=True),
)
# local path to store kernel file
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says "local path to store kernel file" but this transformer is for general file transfer, not kernel-specific transfers. The comment should describe a generic local path for temporary storage during transfers.

Suggested change
# local path to store kernel file
# optional local path for temporary storage during file transfers

Copilot uses AI. Check for mistakes.
Comment on lines 152 to 179
def _validate(self):
runbook: FileTransferTransformerSchema = self.runbook
source = runbook.source
src_node = quick_connect(runbook.source_node, runbook.name)
ls = src_node.tools[Ls]

if not ls.path_exists(source):
raise ValueError(f"source {source} doesn't exist on remote node.")

self._runbook_files: List[str] = runbook.files
if self._runbook_files == ["*"]:
self._runbook_files = []
files = ls.list(source, sudo=True)
if len(files) == 0:
self._log.debug("No files to transfer.")
for file in files:
self._runbook_files.append(PurePath(file).name)

for file in self._runbook_files:
assert ls.path_exists(
f"{source}/{file}"
), f"Node does not contain file: {file}"

Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _validate method is called during _initialize without checking if self._node is available. However, the validate method for FileTransferTransformer accesses a remote node via quick_connect, which may fail or cause issues if called during initialization before the target node is properly set up. Consider whether validation should be deferred or if remote node checks should be moved to _internal_run.

Copilot uses AI. Check for mistakes.
Comment on lines 240 to 242
src_node,
src_path: PurePath,
dest_node,
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The method parameters lack type hints. The parameters src_node, src_path, dest_node, dest_path, and recurse should have proper type annotations for better code clarity and type checking. Based on the TYPE_CHECKING import, src_node and dest_node should likely be typed as "Node".

Suggested change
src_node,
src_path: PurePath,
dest_node,
src_node: "Node",
src_path: PurePath,
dest_node: "Node",

Copilot uses AI. Check for mistakes.
self._runbook_files.append(file)

for file in self._runbook_files:
assert os.path.exists(source / file), f"Node does not contain file: {file}"
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error message in the assertion is misleading. The message says "Node does not contain file" but this is checking the local filesystem in the FileUploaderTransformer context, not a remote node. Consider changing to "Source path does not contain file" or "File does not exist in source path".

Suggested change
assert os.path.exists(source / file), f"Node does not contain file: {file}"
assert os.path.exists(
source / file
), f"File does not exist in source path: {source / file}"

Copilot uses AI. Check for mistakes.
)
# local path to store kernel file
local_path: Optional[str] = ""
# flag to avoid wasting trying scp
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment says "flag to avoid wasting trying scp" which contains grammatical errors. It should be "flag to avoid wasting time trying scp" or "flag to skip scp attempts".

Suggested change
# flag to avoid wasting trying scp
# flag to skip scp attempts

Copilot uses AI. Check for mistakes.
from dataclasses_json import dataclass_json

from lisa import schema
from lisa.node import Node, quick_connect
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'Node' is not used.

Suggested change
from lisa.node import Node, quick_connect
from lisa.node import quick_connect

Copilot uses AI. Check for mistakes.
from lisa.tools.mkdir import Mkdir
from lisa.tools.rm import Rm
from lisa.tools.whoami import Whoami
from lisa.util import constants
Copy link

Copilot AI Jan 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Import of 'constants' is not used.

Suggested change
from lisa.util import constants

Copilot uses AI. Check for mistakes.
@psachdeva-ms psachdeva-ms force-pushed the file_transfer_transformer branch 2 times, most recently from 7889948 to 5083939 Compare January 23, 2026 08:46
dest_key = dest_connection[constants.ENVIRONMENTS_NODES_REMOTE_PRIVATE_KEY_FILE]
if not dest_user or not dest_addr or not dest_key:
self._log_error("destination node connection info inaccessible.")
return -1
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

throw exception here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes more sense, will make the change and push.
Thanks!

)

# Ensure to delete the ssh-key file for security
src_node.tools[Rm].remove_file(path=tmp_key_name, sudo=False)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably we should make sure the above execute_async has been executed successfully, then remove the key file

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we not get rid of the file irrespectively, as there is no retry, and its better to delete the file for security?

return -1

# Use a temporary file for the key on src_node
tmp_key_name = "/tmp/.lisa_dest_key"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if there are multiple run against the same environment, the same key file name may be a problem.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file will be on the destination/target node.
The thought is that the node will get deleted after a successful run, even if not, the file will get deleted after each successful scp command.

`copy_between_remotes` method does scp of a file[s] (recursively) from source
node to destination node. The scp command is executed on the source node. If the
destination node is an azure vm, make sure to add the IP address of the source
node in the list of `source_address_prefixes` of the destination node for the
command to succeed.

Signed-off-by: Piyush Sachdeva <[email protected]>
Signed-off-by: Piyush Sachdeva <[email protected]>
The new file_transfer transformer uses `copy_between_remotes` method to scp
file[s] from source to destination node. Its validate method verifies the
existence of files on the source node and also has the functionality to transfer
all files from source directory on source node using `files: ["*"]`.
In case the `copy_between_remotes` fails, there is also a provision to first
save files from source node to local, and then upload from the local node to
destination. For this to work, the user should set the `local_path` variable.

Signed-off-by: Piyush Sachdeva <[email protected]>
Signed-off-by: Piyush Sachdeva <[email protected]>
@psachdeva-ms psachdeva-ms force-pushed the file_transfer_transformer branch from 5083939 to b40cd66 Compare January 28, 2026 07:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants