Phase1AIdentifyingInSituUseCases

This page is dedicated to identifying the different types of in situ configurations. For now, all ideas are good ideas. If you can think of potential characteristics that are important for defining an in situ configuration, then please list them here. After we collect all the ideas, we will attempt to enumerate possible in situ configurations.

Please edit this page with your thoughts:

  1. expand on existing ideas
    1. take something at too coarse a granularity and add refinement
    2. add missing perspectives, new approaches within a category
  2. add new categories altogether
  3. put comments in each section if you don't want to modify the text that is there, but still want to say something about it (positive or negative)
  4. try not to remove ideas at this stage ... all ideas are good ideas

Access

  • On-node access to data
    • direct access to primary memory
      • (AC) direct access to primary memory with L4 cache on NVM device
    • no direct access to primary memory
      • access through CPU-accelerator sharing (e.g., Xeon to Knights Landing, CPU to GPU)
      • access through GPU-GPU sharing (NVLink)
      • access through data stored to local disk (on node), retrieved from local disk
  • NOT on-node access to data
    • access via network transfer
      • central staging (i.e., single target node aggregates all the data)
      • distributed staging (e.g., each group of n sim nodes transfers to one of several staging nodes)
  • Hybrid
    • data reduction on-node; reduced intermediate results are processed somewhere else

Comments

Thoughts about the above:

  • (SDA) I'm thinking that we may want to separate the location of the data to be processed from how it's accessed since there may be more than one way to access data that resides somewhere. For example, saying that data lives on non-volatile memory attached to a node is a separate thing from saying that it's accessed as a local drive. Some NVM may be accessed through a separate API. Similarly, there is exploration of NVM attached to the interconnect fabric. Accessing that will be custom and should be discussed separately. Even further, sometimes data lives on a parallel file system, but that parallel file system (e.g., DDN) provides processors upon which data operations may be run. This is an even stranger type. Data location and data access are separate ideas.
  • (AB) Should access via burst buffers or other file I/O be considered NOT on-node access?
  • (GHW) I'd say burst buffer is NOT on-node access. Is BB covered by "local disk" or should it be a separate category?
    • (SDA) If a burst buffer is directly attached to the node, then it should be considered on-node access, just as any other resource attached to the node is.
  • (VP) The burst buffer should normally be part of specialized nodes that are not part of the compute-node pool. Therefore it should not be considered part of the on-node case. There is also a question regarding the practical availability (in terms of API) of the BB for computations: the vendors may make it available just for fast I/O even if it is capable of much more.
  • (CDH) Fuzzy Cases
    • Memory mapping - does it matter which file system is used? Is memory mapping as a class not in situ?

(AK) -- I'd say it can be in situ; for example, you can memory map from one process to another. In a sense that would still be "tightly-coupled" in situ, also. (A minimal shared-memory sketch appears at the end of this comments section.)

  • Burst buffers: As Andy notes, in some cases they are basically used as a parallel global file system. I am not sure whether "burst buffer" always indicates this form.
  • Does on-node shared memory (across processes) count as "direct access to primary memory", or not? Does a threading/process model make a meaningful distinction, or is that another orthogonal choice?
  • (TF) Does the CUDA IPC model warrant a distinction? CUDA IPC (or even IPC in general) lets one write a separate *process* that still has direct access to the simulation data.
  • (JC) What exactly is "primary memory"? I think of it as "RAM used by the simulation", where RAM could be main memory or device-specific memory...but I'm sort of making that up.
  • (JC) When I look at memory access, I think in terms of adding traffic. Therefore, I don't think the exact method of access (i.e., memory map or NVLink) is the right question, but "distance to access" is. Basing the terminology too tightly on exact technologies will drive it out-of-date quickly. The distance hierarchy (with probably terrible names) is:
    • On resource (memory on a GPU)
    • Dedicated link (CPU's own bank in NUMA, NVLink, main memory for some GPUs)
    • On node, different device (GPU gets CPU memory, some other processor's memory, NVLink access, etc.)
    • On node, file system
    • Across the network
  • (GHW) I'd second this classification based on "distance to access"
  • (KM) This is sort of mentioned above, but under on-node/no direct access, a use case should be process-to-process communication. Different processes running on the same node have memory protected from each other. To share data they either have to manage a shared memory buffer or transfer messages.
  • (KM) NVRAM/burst buffers should also be listed under NOT on-node access. There are supercomputer configurations where the same NVRAM is accessible from multiple nodes, and this could be a mechanism of loose coupling.
  • (KM) A condition not explored here and rather missing from this whole document is the idea of "hybrid" access where one on-node in situ vis performs an initial memory reduction transformation, which is sent to a secondary off-node vis that performs further processing. This is also related to the What's Produced section below where something "explorable" (using the current terminology there) is created by the on-node and sent to the off-node vis to create something else or perhaps allow interactive vis.
    • (HT) I second that and took the liberty of adding a corresponding category (...for further refinement). IMHO, given the granularity of the "on-node" discussion above, most systems are likely to somehow end up being "hybrid", e.g. some processing is done on a GPU right next to the raw data, reduced results are transferred back to the (node's) CPU, from where the data is shipped somewhere else for further aggregation...
  • (WB) WRT burst buffers: they are logically a deepening of the memory/storage hierarchy that is, at present, accessible as a file system, and used as a file system that is "near" rather than "far" (like a spinning disk). In the future, I expect we may see them presented as an extension of the memory subsystem.
  • (WB) My comment above leads to this point: our present view (in this discussion) is that there is a clear partitioning of on-node and off-node resources/access (with which I more or less agree); this partitioning may become fuzzier in the future, and we should be mindful of that.
  • (HT) Motivated by the level of granularity for the on-node case, I added two (possible, though not necessarily practical) scenarios for "off-node/in-transit/..." processing.
  • (HT) An aspect that may be of interest w.r.t. data access is how data is moved through the pipeline. Traditional vis pipelines mostly follow a pull model. However, it seems to me that most in situ processing is driven by a push from the simulation side: whenever data becomes available, it is driven through the pipeline. While this seems to be the natural mode of operation, there may be cases where it's not beneficial to force all the data through the entire process. I'm not entirely sure, however, whether this point belongs here.
  • (SZ) Related to the aforementioned fuzziness, do we want to consider different types of specialized nodes in the category of off-node access? Right now we seem to mostly call them staging nodes, but I might expect new node types like burst buffer nodes that are heavily involved but separate from the traditional staging nodes.
  • (SZ at HT) The push vs pull consideration seems related to the producer-consumer discussion in Other Considerations below.
  • (CH) On-node/direct access: are others considering the two distinct cases: (1) simulation data is directly accessible and read into a visualization dataspace, and (2) visualization methods operate directly on the simulation data structures?
  • (GHW) Distinction between "pure" processing of simulation results vs communicating results back to simulation so that it can use them later on?
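
To make the memory-mapping / shared-memory cases above (AK, KM) concrete, here is a minimal sketch using POSIX shared memory. The segment name "/sim_field", the field size, and the split into a simulation side and a vis side are illustrative assumptions (error handling is omitted); this is a sketch of the idea, not a reference to any particular system.

    // Sketch: the simulation exposes a field via POSIX shared memory; a separate
    // vis/analysis process on the same node maps the same region read-only, so the
    // data is never copied. Segment name and size are assumptions.
    #include <cstddef>
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    constexpr std::size_t kFieldBytes = 1 << 20;  // assumed size of one field

    // Simulation side: create the shared region and return a writable pointer.
    double* create_shared_field() {
        int fd = shm_open("/sim_field", O_CREAT | O_RDWR, 0600);
        ftruncate(fd, kFieldBytes);
        void* p = mmap(nullptr, kFieldBytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        close(fd);
        return static_cast<double*>(p);
    }

    // Vis/analysis side: map the same region without copying the data.
    const double* open_shared_field() {
        int fd = shm_open("/sim_field", O_RDONLY, 0600);
        void* p = mmap(nullptr, kFieldBytes, PROT_READ, MAP_SHARED, fd, 0);
        close(fd);
        return static_cast<const double*>(p);
    }

Whether this counts as "direct access to primary memory" or as a separate on-node category is exactly the question raised above.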

Synchronous vs asynchronous

  • asynchronous: visualization/analysis & simulation occur concurrently, by sharing compute resources
    • Examples:
      • 31 nodes on supercomputer for sim, 1 node for vis/analysis (see the communicator-split sketch after this list)
      • 31 cores on one node of supercomputer for sim, 1 core for vis/analysis
      • sim running on GPU, vis runs on CPU (or vice versa)
  • synchronous (not asynchronous): processing power devoted exclusively to simulation OR visualization/analysis
  • (GHW) Re-balancing between number of cores for sim vs number of cores for analysis as the simulation runs?
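
A minimal sketch of the space-division examples above (e.g., 31 ranks for sim, 1 for vis), assuming an MPI code; the 32-rank grouping and the run_simulation/run_vis drivers are hypothetical placeholders.

    // Split MPI_COMM_WORLD so that one rank out of every 32 does vis/analysis
    // while the remaining ranks run the simulation concurrently.
    #include <mpi.h>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int is_vis = (rank % 32 == 31);   // assumed 31:1 partitioning
        MPI_Comm role_comm;
        MPI_Comm_split(MPI_COMM_WORLD, is_vis, rank, &role_comm);

        if (is_vis) {
            // run_vis(role_comm);          // hypothetical analysis driver
        } else {
            // run_simulation(role_comm);   // hypothetical simulation driver
        }

        MPI_Comm_free(&role_comm);
        MPI_Finalize();
        return 0;
    }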

Comments

TP: Access and synchronous/asynchronous are not orthogonal. Here is my taxonomy: in situ =

  • time division (synchronous)
  • space division (asynchronous)
    • different thread or core in the same processor or node (shared memory access)
    • different processor or node in the same machine (distributed memory access)
    • different machine in the same facility (socket access)

Thoughts about the above:

  • (DP) time division could also be an asynchronous mode. Multiple time steps in main memory, burst buffer, or transfer to another machine. Am I understanding your taxonomy correctly?
  • (FS) Possible sub-categories we can consider:
    • Vis/analysis between timesteps vs. waiting until after simulation finished, but desired data is still in memory. A sub-type of synchronous?
    • Vis/analysis runs on separate nodes/cores/threads vs. on the same nodes/cores/threads that perform the simulation, i.e. zero separation between vis and sim code. A sub-type of asynchronous?
  • (TF) Or the same N cores could be used for both. By double-buffering, we could be analyzing/rendering/compositing timestep X while we are computing timestep X+1. When we say something is "[a]synchronous", are we saying something about independent processes/computations/nodes in a task graph, or are we saying something about resources? To me, it's the former.
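
A minimal sketch of the double-buffering TF describes, assuming a shared-memory (threaded) setting; compute_step and analyze are hypothetical stand-ins for the simulation and the vis/analysis routines.

    // Analyze timestep X on a helper thread while the main thread computes X+1;
    // the two buffers are swapped after each step.
    #include <cstddef>
    #include <functional>
    #include <thread>
    #include <utility>
    #include <vector>

    void compute_step(std::vector<double>& field, int step);   // assumed
    void analyze(const std::vector<double>& field, int step);  // assumed

    void run(int nsteps, std::size_t n) {
        std::vector<double> front(n), back(n);  // front: being computed, back: being analyzed
        for (int step = 0; step < nsteps; ++step) {
            std::thread vis;
            if (step > 0)
                vis = std::thread(analyze, std::cref(back), step - 1);  // previous step
            compute_step(front, step);   // current step advances concurrently
            if (vis.joinable())
                vis.join();
            std::swap(front, back);      // just-computed step becomes the analysis input
        }
    }

Whether one calls this synchronous (the same cores, time-divided per step) or asynchronous (the analysis is an independent task) is exactly TF's question.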

What's produced

  • explorable
    • images (e.g., ParaView Cinema)
    • transformed data (e.g., Lagrangian basis flows, explorable images, topological descriptions)
    • compressed data (e.g., wavelets, run length encoding, Peter Lindstrom's compression, etc.)
  • not-explorable
    • Images
    • Movies
    • data analysis (e.g., a synthetic diagnostic)

Comments

Thoughts about the above:

  • TP: Various products can be listed as related work, but they don't affect how we think about the system. If you think about this as a workflow or task graph, there can be many intermediate products. Each node performs some transformation on the data, and the products are the links in the graph. The type of product is not at issue, I think.
  • JB: I like the workflow / task graph description. However, I do think that the type of product here is of interest in that the 'explorable' vs 'not-explorable' labels indicate whether the "downstream" or post-processing workflow is a single node (non-explorable) or if it is a task graph itself (explorable).
  • PTB: I agree with the first comment that the product should not matter for the in-situ classification / terminology. So what if I write out some hierarchical descriptor that I then post-process on a small cluster. I don't think this changes the nature of the in-situ workflow. In my mind everything that happens "after" the simulation is complete is post-process and thus not in-situ independent of whether it needs a single core or another full scale analysis.
  • JB: How much do we want to consider additional analysis use cases, such as UQ? There are some Sandia projects that use the term "embedded analysis" to denote what I believe this group would call "in situ ensembles". These use cases have more complex task graphs and the "driver" is not the simulation, rather there is a UQ driver that may run many forward simulations (each with some of its own vis/other data analysis).
  • KLM: UQ is an essential topic for in situ work.
  • JC: I agree the product is not necessary for the classification. Perhaps single- vs. multi-stage processing is important, though, as the number of intermediate phases may affect how the in-situ program influences the main program. Also, multiple stages may be spread across resources (conceivably, both within and across nodes).
  • RS: I think it may be reasonable to include products as a categorization of in situ workflows. I see "compression" as a word that we need to blast apart for several subsets of definitions to address important and subtle benefits to in situ workflows. For instance, what is the difference in time resolution for the in situ output vs. the simulation? How much additional postprocessing are we able to do by creating transformed or compressed data? Using compression creates a tradeoff of quantity vs. quality that may also directly categorize a workflow.
  • SZ: I agree with RS. The product may not be necessary for classifying the in situ system, but we should consider classifying the product itself. Even if it’s disjoint, I propose that it is important to have a unified taxonomy for in situ products.
  • BW: We've been doing extract generation to pull out simplified datasets that we can analyze post hoc. We still consider this in situ because the extract data were generated in situ, even if the eventual analysis is offline. I don't think the data products ought to classify in situ too much but do think that in situ implies that data products have something to do with vis and analysis (the embedding of some post processing operations into the sim).
  • AK: This is less of an "in situ" terminology question and almost more of a "interactive" or "batch" question -- I think one can conceivably use in situ for both batch and interactive use cases. I suspect the former (maybe with Cinema/Explorable Images/Compression/Simplification) will be more common, unless computational scientists really enjoy watching grass grow, or there's an unexpected breakthrough.
  • EB: I think explorable vs non-explorable may not impact how the in situ is done, but it is important to the user and explorable is potentially much more useful to the user. I know there are those users that do the same analysis on every run they do, so then this wouldn't make any difference.
  • VP: I believe most of the transformed data should also be considered a form of data analysis. We should just distinguish between data analysis that leaves additional free parameters and is therefore explorable, and analysis that produces simpler/direct results that do not require further exploration.

How it is carried out

  • dynamic: operations being performed can be changed by the user
    • blocking: the simulation waits while the user edits the operations being performed
    • non-blocking: the simulation proceeds while the user edits the operations being performed; at some point these changes are integrated into the operations being performed (see the sketch after this list)
  • static: operations being performed are fixed at the beginning, i.e., they cannot be changed by the user
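
As a sketch of the dynamic, non-blocking case above: the simulation polls once per step for an updated set of operations and swaps it in when one is available. poll_for_updated_ops, apply_ops, and run_analysis are hypothetical names, not any particular tool's API.

    // The simulation never waits for the user; edits are integrated whenever the
    // non-blocking poll reports a new set of operations.
    #include <optional>
    #include <string>

    std::optional<std::string> poll_for_updated_ops();   // assumed: empty if no edits pending
    void apply_ops(const std::string& ops_description);  // assumed
    void run_analysis(int step);                         // assumed

    void simulation_loop(int nsteps) {
        for (int step = 0; step < nsteps; ++step) {
            // advance_simulation(step);                 // the sim's own work
            if (auto ops = poll_for_updated_ops())
                apply_ops(*ops);                         // integrate the user's edits
            run_analysis(step);                          // run whatever operations are in effect
        }
    }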

Comments

Thoughts about the above:

  • TP: This is a question of whether the workflow graph is cyclic or acyclic. Acyclic graphs (the status quo today) don't have a feedback mechanism and data flows through them in one direction. With cycles, the graph structure itself is still static, but the cycles allow more flexibility for the data flow to modify the operation of the task nodes.
  • AB: I don't agree that acyclic is the status quo. Look at Catalyst's "Live" tools. Also, as far as I know, LibSim started from what I would characterize as dynamic blocking use case (i.e. cyclic).
  • AB: Maybe a better breakdown of this would be interactive vs. batch, with batch indicating that a user is not interactively monitoring or modifying the operations being performed, and interactive meaning that a user is actively monitoring and/or modifying the operations being performed. Additionally, a more general use case is that a user can set up operations a priori, run the simulation starting with that setup, connect to the running simulation (possibly after many time steps) to monitor and possibly adjust operations, and disconnect as needed.
  • CDH: I agree the "interactive" vs "batch" is a good designation. For cyclic (or ~dynamic case), we may need to make a distinction of human in the loop vs simulation feedback. Simulation feedback could be used for dynamic control as well (See note below about Janine's ISAV15 work w/ "triggers")
  • JB: agree with CDH -- I like "interactive and batch" and further denoting whether human in the loop or simulation feedback is being used to drive the dynamic control.
  • AK: agree with JB and CDH.
  • KLM: yes, stick to the conventional terms: interactive and batch.
  • FS: There could also be the case where a user monitors the simulation (exploring the full-res data as it's produced), without necessarily the ability to steer the simulation. Would this fall under "interactive" or should there be separate sub-categories, e.g. "passive-interactive" vs. "active-interactive"?
  • JC: Maybe it's better to think in terms of three: "interactive, dynamic, static" for "human-controlled changes, automated changes, no changes" and "blocking vs. non-blocking" in each of the first two contexts.
  • JC: I think passive/active is really a separate issue of "timeliness". When are results available: Immediate, batch, post-process
  • TF: "static" / "dynamic" is ambiguous: I can imagine an executor that selects between CPU and GPU at runtime. Or (to avoid everything I say having a GPU slant ;-) choosing between Marching Cubes and ray-casting an isosurface at startup. To me that's a "dynamic network executor" and user interaction is orthogonal. Does "steering" have a place here? Though I guess that hits ambiguity between "simulation steering" and "analysis steering"...
  • AB: I'm trying to think of operations that are inherently automated/adaptive. The closest thing I can think of now is a histogram with just the number of bins and no bounds. I'm leaning here towards interactive vs. batch and orthogonally passive vs. active. I then would categorize the triggers work as active batch.
  • DP: There is also a hybrid model. A set of static operations are always performed, and the user can dynamically perform other operations (either blocking, or non-blocking... Or both I guess).

How is it developed and executed

  • Directly compiled with the main simulation code, i.e. direct source code integration (a minimal sketch follows this list)
  • Integrated through some intermediate framework, e.g. Conduit, ADIOS, Glean
  • Independent communication through files, sockets, etc.
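
A minimal sketch of the first option, "direct source code integration", with hypothetical names - insitu_init/insitu_process/insitu_finalize are placeholders, not the API of any framework listed above.

    // The simulation's main loop calls an in situ hook directly, passing pointers
    // to its own data structures, so analysis operates in place on the live data.
    struct SimState {
        double* field;       // simulation-owned memory, accessed in place
        int     nx, ny, nz;
        double  time;
    };

    void insitu_init(const char* config);               // assumed
    void insitu_process(const SimState& s, int step);   // assumed
    void insitu_finalize();                              // assumed

    void main_loop(SimState& s, int nsteps) {
        insitu_init("analysis_config.json");   // hypothetical configuration file
        for (int step = 0; step < nsteps; ++step) {
            // advance_simulation(s);          // the sim's own time integration
            insitu_process(s, step);           // analysis sees the data zero-copy
        }
        insitu_finalize();
    }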

Comments

Thoughts about the above:

  • TP: These are three different ideas. How a set of operations is compiled and run: as one big executable or separate programs. What software was used is orthogonal. Likewise, what communication mechanism is used.
  • TP: In my mind, if the communication is through files then it's not in situ. I would limit the scope of communication to shared memory, message passing, and sockets.
  • AK: Re: TP's comment, what if file IO is not prohibitive for each timestep, and you're mostly using in situ analysis to be able to filter time steps without having to store everything (and doing that on the compute resource)? I agree it's a cop-out but it seems like it could still be "in situ" to me.
  • CDH: "direct source code integration" is a verbose term for using a library or writing a one-off algorithm in a sim code. With this designation, everything a sim code developer does for analysis is insitu. It's a needed designation, but it's often used to label distributed-memory analysis algorithm research as "insitu". It's confusing b/c the real depth in these papers is the algorithm research. A clear label for this case would be great, even better would be advice on when it is appropriate this type of integration -- but that may be too prescriptive for this effort.
  • CDH: I agree with Tom about files, except lately I feel that memory mapping, burst buffers, RAM Disks, seem to muddy the waters.
  • PTB: So I just saw a presentation from Oregon where they used a file-based mechanism to (tightly) couple two pieces of a simulation. So I disagree that file-based means not in situ. It depends on where the file lives and for how long. If you keep a local file on the node because it is easier to implement than an in-memory hand-off, that still makes it an in-situ workflow. Also, CDH has a point in that with burst buffers and memory-mapped files, the designation of what we might consider a file is somewhat strange.
  • AK: agree with PTB, that's in situ.
  • HT: I guess "file-based data exchange" is not exactly a well-defined term all by itself, and thus IMHO we should not just draw a line based on the term "file". Even today, we can use a "disk-like device" and make it look like memory and v.v. From the programming/integration point of view, it's a very neat way of abstraction because simulations regularly "write out stuff". Hence, one can interface at that point with minimum effort for the simulation programmer. As Timo suggests, it's matter of where the file lives. In this regard, for example, local NVMe devices are an interesting option even today.
  • KLM: I thought being in situ, in transit, co-processing, or ... suggests how it is best built.
  • JC: Maybe files are in-situ if they are already being used....but files are not in-situ if they are only used for the visualization.
  • JC: My full opinion is that files are OK for in-situ, but they represent a different class of in-situ than memory-resident. This is especially true for machines with per-node file systems (not just a network-based one).
  • TF: IMHO, this is all "integration". Strawman proposal: perhaps "Direct integration", "library integration", and maybe "protocol-based integration"?
  • TF: What about transparent integration methods, such as using the PMPI layer, or inferring from the simulation's source code? (A minimal PMPI interposition sketch appears after this comment list.)
  • WB: On the subject of files and data comm between different processing stages: the "difference" here is what layer in the storage/memory hierarchy is being used to buffer/share data. In the (now distant) past, workflow/visual programming systems (e.g., AVS, Khoros) moved data between processing stages using files; then, as things evolved, they used shared memory segments. To the user, it didn't matter, except that it was faster when using a buffer that was shallower in the hierarchy (with the obvious limitations on the size of thing that could be placed into a given layer of the memory hierarchy).
  • WB: As such, I suppose my question is the following: does a "deep share" or "shallow share" really affect in situ terminology? Does a deep share (e.g., through a file) fundamentally change that a computation is happening while a sim is running? I would argue "probably not". Having some measure of "deep share" vs. "shallow share" is probably useful, but not as meaningful as the distinction of "post hoc" vs. what we are calling "in situ."
  • WB: I'll put this idea forth: in situ is not the opposite of post hoc; we are chasing the wrong terminology here. :) Instead, the opposite of post hoc processing is concurrent processing. In situ is one way concurrent processing is carried out (shallow share) vs in transit (deep share, which means data movement). Both may be done synchronously or asynchronously.
  • WB: there is plenty of historical precedent that reinforces this idea. In the 1990s, this kind of work was commonly referred to as concurrent processing or coprocessing (not In Situ processing). See, e.g., http://sda.iu.edu/docs/CoprocSurvey.pdf (Aug 1998), Concurrent Distributed VIsualization and Simulation Steering, Robert Haimes, In Parallel Computational Fluid Dynamics: Implementations and Results Using Parallel Computers, A. Ecer, J. Periaux, N. Satofuka, and S. Taylor (eds), 1995, Elsevier (google for “avs visualization system co-processing”)
  • HT: We have recently had a number of discussions over here about the mode of integration. Based on these, I see a more or less continuous gradient between the first (direct linking) and third (sockets, files, ...) options above. Assuming we directly compile some form of in situ code into the simulation, what does it actually do? Does it fully process incoming data into pixels? Does it only grab stuff and send it somewhere else? Grab it, convert it, and send it? Grab it, compress it, and send it? Grab it, write it out locally and trigger some downstream processing? etc. In a comprehensive assessment of any particular use case - one that does not just optimize for overall execution time but also accounts for development and maintenance effort, etc. - there may be reasons to follow each of these patterns. So maybe we cannot delineate options along this dimension as clearly as we'd like for a taxonomy.
  • KM: I would argue that this category is confounding two things that should be separate. One is the positional relationship between simulation and vis and how they communicate. This is actually already covered under "Access" above and should be removed. The second thing is from a developmental standpoint whether the simulation and vis codes are directly interfaced or go through a middleware layer. Although direct integration often suggests direct memory access and a middleware suggests communication, this is not necessarily the case. It is possible in principle for an I/O middleware layer such as ADIOS or HDF5 to pass buffer pointers rather than copy data, and it is possible for a directly integrated vis library like Catalyst to start by transferring data off-node. Thus, I suggest making this configuration simply about whether a middleware layer is used to interface sim and vis.
  • EB: I also believe that files should be considered as in situ. A file is a temporary storage location, just like main memory. Perhaps the key consideration is that it is transitory and not permanent.
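
A minimal sketch of the transparent, PMPI-based interception TF mentions above (in the spirit of Freeprocessing): the analysis library provides its own definition of an MPI I/O routine, peeks at the outgoing buffer, and forwards to the real implementation via the PMPI_ entry point. analyze_buffer is a hypothetical placeholder; the wrapper would be interposed at link or LD_PRELOAD time, with no changes to the simulation source.

    #include <mpi.h>

    void analyze_buffer(const void* buf, int count, MPI_Datatype type);  // assumed

    // Wrapper with the standard MPI name: the simulation's existing write call lands
    // here, the data is inspected in situ, and then the real PMPI entry point runs.
    extern "C" int MPI_File_write_all(MPI_File fh, const void* buf, int count,
                                      MPI_Datatype datatype, MPI_Status* status) {
        analyze_buffer(buf, count, datatype);
        return PMPI_File_write_all(fh, buf, count, datatype, status);
    }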

Other considerations

  • data lifetime: data can be freed/changed by simulation code during vis/analysis execution vs data will not be freed/changed
  • memory footprint: vis/analysis routines take: (1) an unknown amount of memory, (2) a fixed amount of memory, (3) a user settable amount of memory
    • Examples: (1) memory for isosurface may vary since it is data-dependent and thus hard to bound. (2) some algorithms allow you to calculate the exact memory needed in advance. (3) some algorithms can take user parameters to control the amount of memory used, streaming data if necessary to stay within that bound
  • buffered input of data: what if vis/analysis is being done asynchronously on dedicated resources, and the simulation is feeding time slices to the vis/analysis routines faster than they can process them? What are the ideas for handling this problem? Do they in some way define the in situ configuration?
    • TF: this is a standard producer-consumer problem/situation. We should steal terminology from them instead of making something up. That said, I don't follow that community closely enough to know what they call it when they have overzealous producers and lazy consumers. Maybe "America"? ;-) (A bounded-queue sketch of this situation appears after this list.)
  • regular vs adaptive resolution of in situ
    • example: Janine Bennett presented work at ISAV with "triggers" that cause in situ analysis to happen more frequently after an event occurs
  • CDH: In general, in situ must place some constraint on the analysis that doesn't exist for traditional post-processing.
  • PTB: (at CDH) That classification is a very interesting notion, though I am not sure I fully agree. If I render a movie frame, it seems I could use exactly the same code in-situ or in post-process. I guess one could argue that the high temporal frequency required might be the additional requirement. So would the idea be to label anything "in-situ" that couldn't be done off-line for some reason or another, while everything that could be done later would be post-process (even though for convenience we might do it in-situ anyhow)? So rendering a frame for each RK step would be in-situ, while computing the final yield of an explosion would be post-process even if I happen to do it in parallel after my "main loop" has finished?
  • KM: A missing configuration is the generality of the visualization system. Some in situ visualization is a custom, lightweight, single-function routine designed for one specific vis task in one specific sim. On the other end of the spectrum is a large, general-purpose visualization system (such as Catalyst or Libsim) that is a full-function and flexible service. In my experience this is a pretty big design consideration.
    • CDH: Agree w/ KM on the need to classify generality of the system. This concept is what I was grasping for in my comment about "direct source code integration" above.
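
To make the buffered-input point above concrete (see TF's note), here is a bounded-queue sketch of the producer-consumer situation: the simulation pushes time slices, the analysis pops them, and when the queue is full the push either blocks the producer (throttling) or drops the slice - the two flow-control policies mentioned in this discussion. The payload type and capacity are illustrative.

    #include <condition_variable>
    #include <cstddef>
    #include <deque>
    #include <mutex>
    #include <utility>
    #include <vector>

    // Bounded queue of time slices between the simulation (producer) and the
    // vis/analysis routines (consumer).
    class TimeSliceQueue {
    public:
        explicit TimeSliceQueue(std::size_t capacity) : cap_(capacity) {}

        // Returns false if the slice was dropped (non-blocking policy).
        bool push(std::vector<double> slice, bool block_when_full) {
            std::unique_lock<std::mutex> lk(m_);
            if (q_.size() >= cap_) {
                if (!block_when_full) return false;                   // drop the slice
                not_full_.wait(lk, [&] { return q_.size() < cap_; }); // throttle the sim
            }
            q_.push_back(std::move(slice));
            not_empty_.notify_one();
            return true;
        }

        // Consumer side: blocks until a slice is available.
        std::vector<double> pop() {
            std::unique_lock<std::mutex> lk(m_);
            not_empty_.wait(lk, [&] { return !q_.empty(); });
            std::vector<double> slice = std::move(q_.front());
            q_.pop_front();
            not_full_.notify_one();
            return slice;
        }

    private:
        std::size_t cap_;
        std::mutex m_;
        std::condition_variable not_full_, not_empty_;
        std::deque<std::vector<double>> q_;
    };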

Comments

Thoughts about the above:

  • TP: Data lifetime is a function of the in situ mechanism. If it's synchronous, then it's not an issue because the upstream task is blocked. If it's asynchronous, then data must be double-buffered, and the first copy can change.
  • (AB at TP) certain operations (e.g. particle tracking, auto-correlation, etc.), even if performed synchronously, will likely require the in situ infrastructure to cache/double-buffer information from previous time steps.
  • TP: Resilience is another consideration. Memory overflow is a specific instance of a cause of a hard fault, but in general little has been done to make in situ workflows resilient to hard and soft errors.
  • TP: The buffering issue is a good one too: in general, little has been done in the way of flow control other than throttling everything to the slowest task or dropping data that cannot be ingested by a slow task. Buffering to other levels of the memory/storage hierarchy is the next research direction here, but not much exists that is ready for production.
  • PO: I agree with TP. Resilience has been used as a differentiating point in past articles, but it seems too loosely applied. It's an important consideration, but concrete implications would provide clarity.

Log

Want to communicate that you looked this over? Add your name to the list below:

  • Hank Childs (HC)
  • James Kress
  • Tom Peterka (comments labeled TP)
  • Andy Bauer (comments labeled as AB)
  • Cyrus Harrison (comments labeled as CDH)
  • Janine Bennett (comments as JB)
  • Dave Pugmire (comments as DP)
  • Peer-Timo Bremer (comments as PTB)
  • Kwan-Liu Ma (comments as KLM)
  • John Patchett (comments as JMP)
  • Franz Sauer (comments as FS)
  • Tom Fogal (TF)
  • Joseph Cottam (comments as JC)
  • Ken Moreland (comments as KM)
  • Amit Chourasia (comments as AC)
  • Patrick O'Leary (comments as PO)
  • Valerio Pascucci (comments as VP)
  • Daniel Weiskopf (DW)
  • Berk Geveci (BG)
  • Wes Bethel (WB)
  • Sean Ziegeler (SZ)
  • David Rogers (DR)
  • Bernd Hentschel (HT)
  • Rob Sisneros (RS)
  • Chuck Hansen (CH)
  • Brad Whitlock (BW)
  • Aaron Knoll (AK)
  • Sean Ahern (SDA)
  • Jeremy Meredith (JM)
  • Eric Brugger (EB)
  • Jean Favre (JF)
  • Michel Rasquin (MR)

High-level

HC moved this conversation to the end of the page. Phase 1A is about agreeing on configurations. We'll try to figure out terminology later.

  • (AB) Should we be discussing some terminology that can be used as an umbrella for all of these methods? It seems like in situ is somewhat commonly used for this. It would be nice if we could get to a point where we could say something like "in situ methods" for that and really mean in situ, in transit, or hybrid, in a synchronous or non-synchronous manner...
  • (JB) Agreed that an umbrella term would be good. I've been involved in papers where we use "concurrent analysis" as the umbrella term for the various techniques.
  • (DP) Agree that umbrella term(s) would be good. Are the existing terms too overloaded? Do we need to create a new term? Or are there existing terms that will work? How far into the future can we look and ensure the terms aren't outdated in 5, 10, 15, or more years?
  • (KLM) I would suggest not to introduce yet another new term, nor do we need an umbrella term. Our goal should be to (agree among ourselves and) help others use the correct term. In situ is well defined; based on its meaning, strictly it's about processing data while it's still in memory (of the node the simulation runs on) without data movement (except for some minimal data exchange among processors). In transit, hybrid processing, co-processing, and others are just different strategies that many have been using due to the hardware and user settings. I feel our job now is to clearly define each existing term for the community.
  • (JMP) I agree with KLM and the definition of in situ. In situ is performed on data while it's in memory, the first time it's in memory. There is a category I use, "peri-processing", that encapsulates both in situ and in transit. These two are similar in the requirement for a priori or automated (which is also a priori) decision making by the simulation user. What, where, and when are the main questions to answer for each workflow. "Simulation artifacts" or "data products" can be grouped; they range from checkpoint restarts to single values, and they are prognostic (required to advance the simulation) or diagnostic (derived from the prognostics). An emerging problem being exposed by burst buffers is length of persistence in some memory and distance to compute elements. These become important for synchronous vs. asynchronous discussions.
  • (FS) What about Freeprocessing (data interception as it goes to the I/O library)? Is that something that shouldn't be considered in situ, but considered "peri-processing"? See Fogal et al., "Freeprocessing: Transparent in situ visualization via data interception", EGPGV 2014.
  • (JC) I agree that "in situ" essentially is the umbrella term. However, I think in situ may involve moving some memory between nodes to (for example) find the global range of values via an all-reduce. Such should be rare, but shouldn't disqualify something that's 99% on-node memory.
  • (PO) I agree with KLM.
  • (DW) General, high-level note on the terminology "in situ": it comes with a very different semantics in the context of infovis. Although there is no confusion from the HPC/scivis perspective with what we usually understand under "in situ", it could be briefly clarified - especially if infovis/VA are going into the direction of large data.
  • (HT) When I look at the past use of the concept of "in situ", I have to agree with KLM et al: it would mean the direct processing of data while it is in memory - at its original location. Yet, the notion has evolved and today many people seem to use it for "everything that happens at simulation time". Given current hardware developments, I think we might be overconstraining the definition if we settle for the very strict "directly on-node or nothing" meaning. Given the notion above, I would suggest to delineate it along the temporal dimension (when?) rather than the location dimension (where?), the latter being much too complicated on next-gen systems, anyway. At the end of the day, IMHO in situ is not about making it fit a definition; rather, it is about a specific requirement: it's a way to reduce I/O and data movement in order to cut down runtime and/or energy consumption. That doesn't mean we have to eliminate data movement altogether.
  • (BW) I agree with KLM's in situ definition and usage. We've had different terms for style of in situ such as "tightly coupled", "loosely coupled". More recently we've favored "in situ" and "in transit", respectively. These can be combined as well -- it's still all in situ.
  • AK: in situ: anything that generally involves doing vis/analysis alongside compute, even on a different resource, with the goal of minimizing data movement or storage. In transit is a subset of in situ that implies using a different resource. Not sure about co-processing; I feel like it should involve a different "coprocessor" (i.e. GPU/Phi) but that term and "in transit" have been used interchangeably. Anyone want to help on in transit vs coprocessing?