
SystemML Workflow

Create DML script

  • Use the org.apache.sysml.api.mlcontext.Script class to create a DML script. Its in() and out() methods map the script's input and output variables.

Handling DML

/org/apache/sysml/api/mlcontext/ScriptExecutor.class

public MLResults execute(Script script) {
    // Compilation: parse and validate the script.
    this.setup(script);
    this.parseScript();
    this.liveVariableAnalysis();
    this.validateScript();
    // Build and rewrite the high-level operator (HOP) DAGs.
    this.constructHops();
    this.rewriteHops();
    this.rewritePersistentReadsAndWrites();
    // Lower to low-level operators (LOPs) and generate the runtime program.
    this.constructLops();
    this.generateRuntimeProgram();
    this.showExplanation();
    this.globalDataFlowOptimization();
    this.countCompiledMRJobsAndSparkInstructions();
    this.initializeCachingAndScratchSpace();
    this.cleanupRuntimeProgram();

    // Execution: cleanup runs even if the runtime program fails.
    try {
        this.createAndInitializeExecutionContext();
        this.executeRuntimeProgram();
    } finally {
        this.cleanupAfterExecution();
    }

    // Attach the results to the script and return them.
    MLResults mlResults = new MLResults(script);
    script.setResults(mlResults);
    return mlResults;
}
  • More time is needed to understand the following steps:

    • constructHops()
    • rewriteHops()
    • countCompiledMRJobsAndSparkInstructions()
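
The control flow of execute() above (a fixed sequence of compilation phases, followed by execution wrapped in try/finally) can be sketched in isolation. This is only an illustration of that shape, not SystemML code: PipelineSketch, its phase() helper, and the shortened phase list are hypothetical, and the phases just record their names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for ScriptExecutor; it only mirrors the control flow.
class PipelineSketch {
    // Records the order in which phases run, so the flow can be inspected.
    final List<String> trace = new ArrayList<>();
    private final boolean failDuringExecution;

    PipelineSketch(boolean failDuringExecution) {
        this.failDuringExecution = failDuringExecution;
    }

    private void phase(String name) {
        trace.add(name);
    }

    // Same shape as execute(Script): compile phases, then guarded execution.
    List<String> execute() {
        phase("parseScript");
        phase("constructHops");
        phase("rewriteHops");
        phase("constructLops");
        phase("generateRuntimeProgram");
        try {
            phase("executeRuntimeProgram");
            if (failDuringExecution) {
                throw new IllegalStateException("simulated runtime failure");
            }
        } finally {
            // Runs on success and on failure alike.
            phase("cleanupAfterExecution");
        }
        return trace;
    }

    public static void main(String[] args) {
        PipelineSketch ok = new PipelineSketch(false);
        System.out.println(ok.execute());

        PipelineSketch failing = new PipelineSketch(true);
        try {
            failing.execute();
        } catch (IllegalStateException e) {
            // The cleanup phase is still the last entry in the trace.
            System.out.println(failing.trace);
        }
    }
}
```

In both runs the trace ends with cleanupAfterExecution, which is the guarantee the finally block in execute() provides.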

Execute DML script

/org/apache/sysml/api/mlcontext/ScriptExecutor.class
this.executeRuntimeProgram() passes through the following levels:

  • Script: org.apache.sysml.api.ScriptExecutorUtils.executeRuntimeProgram(ScriptExecutor se, int statisticsMaxHeavyHitters)
  • Program: org.apache.sysml.runtime.controlprogram.Program.execute(ExecutionContext ec)
  • ProgramBlock: org.apache.sysml.runtime.controlprogram.ProgramBlock#execute(ExecutionContext ec)
    • Recompile the code if dynamic recompilation is enabled:
      • tmp = Recompiler.recompileHopsDag(this._sb, this._sb.get_hops(), ec.getVariables(), (RecompileStatus)null, false, true, this._tid);
      • org.apache.sysml.hops.recompile.Recompiler#recompileHopsDag(org.apache.sysml.parser.StatementBlock, java.util.ArrayList<org.apache.sysml.hops.Hop>, org.apache.sysml.runtime.controlprogram.LocalVariableMap, org.apache.sysml.hops.recompile.RecompileStatus, boolean, boolean, long)
      • org.apache.sysml.parser.DMLProgram#createRuntimeProgramBlock
      • org.apache.sysml.lops.compile.Dag#doGreedyGrouping
        • org.apache.sysml.lops.compile.Dag#deleteUpdatedTransientReadVariables
        • org.apache.sysml.lops.compile.Dag#generateRemoveInstructions
        • org.apache.sysml.lops.compile.Dag#generateInstructionsForInputVariables
        • org.apache.sysml.lops.compile.Dag#generateControlProgramJobs
  • Instructions:
    • org.apache.sysml.runtime.controlprogram.ProgramBlock#executeInstructions
    • org.apache.sysml.runtime.controlprogram.ProgramBlock#executeSingleInstruction
      • org.apache.sysml.runtime.instructions.cp.VariableCPInstruction#processInstruction: creates the variables as MatrixObject instances
      • org.apache.sysml.runtime.matrix.data.LibMatrixDatagen#createRandomMatrixGenerator
      • org.apache.sysml.runtime.instructions.cp.MatrixBuiltinCPInstruction#processInstruction
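
The nesting listed above (a program runs its blocks, a block optionally recompiles and then runs its instructions one by one against a shared context) can be sketched with plain Java. All names here (ExecutionSketch, Context, Block, Instruction) are hypothetical stand-ins rather than SystemML classes, and the "recompile" step only copies the instruction list where a real recompiler would regenerate it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class ExecutionSketch {
    // Hypothetical stand-in for ExecutionContext: just a variable map.
    static class Context {
        final Map<String, Double> vars = new HashMap<>();
    }

    // Hypothetical stand-in for a runtime instruction.
    interface Instruction {
        void process(Context ec);
    }

    // Hypothetical stand-in for ProgramBlock.
    static class Block {
        private List<Instruction> instructions;
        private final boolean needsRecompile;
        int recompileCount = 0;

        Block(List<Instruction> instructions, boolean needsRecompile) {
            this.instructions = instructions;
            this.needsRecompile = needsRecompile;
        }

        void execute(Context ec) {
            if (needsRecompile) {
                // Placeholder: a real recompiler would regenerate the
                // instructions; the sketch only copies the list.
                instructions = new ArrayList<>(instructions);
                recompileCount++;
            }
            // Instructions run strictly in order against the shared context.
            for (Instruction inst : instructions) {
                inst.process(ec);
            }
        }
    }

    // Hypothetical stand-in for Program: an ordered list of blocks.
    static class Program {
        final List<Block> blocks = new ArrayList<>();

        void execute(Context ec) {
            for (Block b : blocks) {
                b.execute(ec);
            }
        }
    }

    static Context runDemo() {
        Instruction createX = ec -> ec.vars.put("x", 2.0);
        Instruction squareX = ec -> ec.vars.put("y", ec.vars.get("x") * ec.vars.get("x"));

        Program program = new Program();
        program.blocks.add(new Block(Arrays.asList(createX), false));
        program.blocks.add(new Block(Arrays.asList(squareX), true));

        Context ec = new Context();
        program.execute(ec);
        return ec;
    }

    public static void main(String[] args) {
        System.out.println(runDemo().vars.get("y"));
    }
}
```

The second block depends on a variable created by the first, which is why blocks and instructions share one context and run in order.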

Parfor

org.apache.sysml.api.mlcontext.MLContext ->
org.apache.sysml.api.mlcontext.ScriptExecutor ->
org.apache.sysml.api.ScriptExecutorUtils ->
org.apache.sysml.runtime.controlprogram.Program ->
org.apache.sysml.runtime.controlprogram.ProgramBlock ->
org.apache.sysml.runtime.instructions.cp.FunctionCallCPInstruction ->
org.apache.sysml.runtime.controlprogram.FunctionProgramBlock ->
org.apache.sysml.runtime.controlprogram.ParForProgramBlock ->
org.apache.sysml.runtime.controlprogram.parfor.RemoteParForSpark ->
org.apache.sysml.runtime.controlprogram.parfor.RemoteParForSparkWorker

See TaskPartitioner for how tasks are created and
ParWorker for how those tasks are eventually executed.
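
The division of labor between a task partitioner and a worker can be illustrated with a small local sketch. It assumes fixed-size partitioning of an iteration range and a thread pool in place of Spark executors. ParForSketch, Task, partition(), and runTask() are hypothetical names, not the SystemML classes, and the loop body (summing squares) is made up for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ParForSketch {
    // A task is a contiguous range of loop iterations [from, to].
    static class Task {
        final int from;
        final int to;

        Task(int from, int to) {
            this.from = from;
            this.to = to;
        }
    }

    // Partitioner role: split iterations 1..n into tasks of at most taskSize.
    static List<Task> partition(int n, int taskSize) {
        List<Task> tasks = new ArrayList<>();
        for (int start = 1; start <= n; start += taskSize) {
            tasks.add(new Task(start, Math.min(start + taskSize - 1, n)));
        }
        return tasks;
    }

    // Worker role: run the loop body for every iteration of one task.
    static long runTask(Task t) {
        long sum = 0;
        for (int i = t.from; i <= t.to; i++) {
            sum += (long) i * i;
        }
        return sum;
    }

    // Driver: hand each task to a pool of workers and combine the results.
    static long parallelSumOfSquares(int n, int taskSize, int workers) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (Task t : partition(n, taskSize)) {
                Callable<Long> job = () -> runTask(t);
                results.add(pool.submit(job));
            }
            long total = 0;
            for (Future<Long> f : results) {
                total += f.get();
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Iterations 1..10 in tasks of 3 give ranges 1-3, 4-6, 7-9, 10-10.
        System.out.println(parallelSumOfSquares(10, 3, 4)); // prints 385
    }
}
```

Keeping partitioning separate from execution means the task size can be tuned without touching the worker logic.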