CHAIMS Scheduler
Connan King, 6/8/98

Purpose of the Scheduler

The scheduler program rearranges CHAIMS calls in a megaprogram at compile time to increase the megaprogram's performance. Three types of calls are handled by the scheduler: TERMINATE, EXTRACT, and INVOKE.

TERMINATE: The scheduler inserts TERMINATE statements into the megaprogram so that a method called by INVOKE is terminated as soon as it is no longer needed by the megaprogram. This relieves the server of the burden of holding storage space for the method longer than necessary, and ensures that a method isn't left active on the server if the programmer fails to terminate it.

EXTRACT: EXTRACT calls are moved within the megaprogram so that each EXTRACT is issued just before its data is needed. The megaprogram then spends as little time as possible sitting idle while waiting for the data to become available.

INVOKE: INVOKE calls are arranged so that they are made as soon as possible, giving the method as much time as possible to execute.

Scheduling Strategies:

TERMINATE: The scheduler generates a TERMINATE statement for every call to INVOKE. The scheduler searches the megaprogram for the last usage of the invoke ID in a CHAIMS call, and places the TERMINATE immediately after it.

Special Considerations:
1) If the scheduler encounters an INVOKE call that reuses a previously used invoke ID, it places a TERMINATE before that INVOKE statement so that the earlier invocation isn't "orphaned." This handles the situation where an invoke ID is assigned multiple times within a WHILE loop.
2) If an IF block or WHILE loop makes use of an invoke ID that also appears later in the megaprogram, a TERMINATE will be placed after the block or loop.
3) INVOKEs whose invokeIDs are not called later in the program are not terminated. If an INVOKE does not produce results (such as the print function in the I/O megamodule), there is no way to determine at what point the method should be terminated.
The only safe way to terminate such a call would be to wait until the end of the megaprogram, at which point it will be terminated when the megamodule is terminated.

EXTRACT: EXTRACT calls are moved so that any statements in the megaprogram that do not depend on the results of the EXTRACT are executed first.

Special Considerations:
EXTRACT calls are usually preceded by a WHILE loop which checks whether the desired data is ready. The scheduler detects these loops and moves them along with the EXTRACT call.

INVOKE: INVOKE calls are moved ahead of any program statements which the INVOKE doesn't depend on. That is, the INVOKE is executed as soon as all of the data required to make the call is available.

Special Considerations:
1) When an INVOKE relies on data from a previous EXTRACT and the two statements are separated by lines of non-dependent code, the scheduler must choose whether to move the EXTRACT down or move the INVOKE up. In the current scheduler implementation, the EXTRACT statement is moved down, due to the order in which the scheduler rearranges program statements.
2) When an INVOKE is called from within an IF block, the scheduler could potentially move the call outside of the IF so that it is executed early regardless of whether the IF branch is actually taken. Calling the INVOKE outside the IF introduces the possibility that the server will execute a method whose results will never be used, but would also increase the likelihood that the data will be ready if the EXTRACT is called. Currently, the scheduler will not move an INVOKE out of an IF block unless an identical INVOKE statement also appears in the associated ELSE block. This can be changed by setting the constant SERVER_COST to zero in the scheduler source code.
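
The rule in consideration 2 can be illustrated with a small Python sketch. This is not the scheduler's source: the statement notation, the IfBlock class, and the function names are hypothetical, and the way SERVER_COST is tested here is only an assumption drawn from the sentence above (a value of zero lifts the restriction). The sketch also ignores whether the INVOKE depends on other statements inside the IF.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical stand-in for the constant named in the text. How the real
# scheduler stores or uses SERVER_COST is an assumption in this sketch.
SERVER_COST = 1


@dataclass
class IfBlock:
    """Toy model of an IF/ELSE: each branch is a list of statement strings."""
    then_branch: List[str]
    else_branch: List[str] = field(default_factory=list)


def is_invoke(stmt: str) -> bool:
    # Made-up notation: any statement containing the word INVOKE is a call.
    return "INVOKE" in stmt


def hoistable_invokes(block: IfBlock, server_cost: int = SERVER_COST) -> List[str]:
    """Return the INVOKE statements in the IF branch that may move above the IF.

    An INVOKE qualifies when an identical statement appears in the ELSE
    branch (it would run either way), or when server_cost is zero
    (wasted server work is treated as free).
    """
    result = []
    for stmt in block.then_branch:
        if not is_invoke(stmt):
            continue
        if server_cost == 0 or stmt in block.else_branch:
            result.append(stmt)
    return result
```

With the default cost, an INVOKE that appears only in the IF branch stays where it is; calling hoistable_invokes(block, server_cost=0) on the same block returns it as a candidate for early execution.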
For situation #1, in order to evaluate whether to move the EXTRACT or the INVOKE, we need estimates for the following execution times: execution time of the earlier INVOKE, execution time for the statements between the INVOKE and EXTRACT, time from the EXTRACT to the dependent INVOKE, time to execute that INVOKE, and time from that INVOKE to the next EXTRACT. For situation #2, we need to know the cost of the INVOKE (in terms of time, money, bandwidth) and the amount of time that we save if we move it out of the IF block. These calculations would be complicated by the presence of IF and WHILE blocks in the code and by the possibility that some of the statements involved could be rescheduled, rendering the calculation invalid.

Current limitations of the scheduler:

The scheduler design is based on several assumptions about the megaprogram.
1) The program being scheduled is correct. Any syntactical errors in the program will cause the scheduler to parse the program incorrectly. In addition, any errors in the original program may manifest themselves differently in the scheduled program.
2) The program is properly formatted. Statements within program loops must be enclosed in brackets, and statements must be on separate lines, since there is no character which indicates the end of a statement in CHAIMS.
3) When in doubt, do nothing. If the scheduler encounters a statement in the megaprogram that does not fit its understanding of the CHAIMS language, it will not schedule any statements around it.
4) The programmer will "play nice." If the program attempts to do anything that isn't explicitly allowed by the CHAIMS API, the scheduler may not work properly. Examples: making duplicates of invoke handles or megamodule handles.

Here are some specific scenarios that cause problems with the scheduler.
The scheduler may not work properly if a CHAIMS call other than EXAMINE is used as part of a conditional statement.
Making a duplicate of an invokeID may cause errors in the megaprogram, as the original can be terminated while the duplicate is still in use.
The scheduler will sometimes incorrectly determine that an INVOKE depends on a conditional loop, and will not schedule the INVOKE ahead of it.
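
The duplicate-invokeID problem can be reproduced with a toy version of the last-use TERMINATE strategy. The Python sketch below is an illustration under stated assumptions, not the scheduler's implementation: the statement notation ("<id> = INVOKE ...", "EXTRACT <id>", "TERMINATE <id>") is invented for the example and is not CHAIMS syntax, any later mention of the ID counts as a use, and only straight-line code with no reused IDs is handled.

```python
import re
from typing import Dict, List

# Toy notation (an assumption): "<id> = INVOKE ..." starts an invocation.
INVOKE_RE = re.compile(r"\s*(\w+)\s*=\s*INVOKE\b")


def insert_terminates(lines: List[str]) -> List[str]:
    """Insert 'TERMINATE <id>' after the last line that mentions each invoke ID.

    IDs that are never mentioned after their INVOKE get no TERMINATE,
    mirroring the "not terminated" case described in the text.
    """
    # Pass 1: record the line on which each invoke ID is assigned.
    invoked: Dict[str, int] = {}
    for i, line in enumerate(lines):
        m = INVOKE_RE.match(line)
        if m:
            invoked[m.group(1)] = i

    # Pass 2: find the last later line that mentions each ID as a whole word.
    terminate_after: Dict[int, List[str]] = {}
    for ident, start in invoked.items():
        pattern = re.compile(rf"\b{re.escape(ident)}\b")
        uses = [i for i in range(start + 1, len(lines)) if pattern.search(lines[i])]
        if uses:
            terminate_after.setdefault(uses[-1], []).append(ident)

    # Pass 3: rebuild the program with the TERMINATE lines spliced in.
    out: List[str] = []
    for i, line in enumerate(lines):
        out.append(line)
        for ident in terminate_after.get(i, []):
            out.append(f"TERMINATE {ident}")
    return out
```

Given the four statements "ih = INVOKE sum(a, b)", "dup = ih", "x = EXTRACT ih", "y = EXTRACT dup", the sketch places TERMINATE ih after the third statement, because a name-based search cannot see that dup refers to the same invocation. The final EXTRACT through dup then runs against an invocation that has already been terminated.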