'\" te .\" Copyright (c) 2000, 2014, Oracle and/or its affiliates. All rights reserved. .TH cpc_bind_curlwp 3CPC "1 Feb 2011" "SunOS 5.11" "CPU Performance Counters Library Functions" .SH NAME cpc_bind_curlwp, cpc_bind_pctx, cpc_bind_cpu, cpc_unbind, cpc_request_preset, cpc_set_restart \- bind request sets to hardware counters .SH SYNOPSIS .LP .nf cc [ \fIflag\fR\&.\|.\|. ] \fIfile\fR\&.\|.\|. \fB-lcpc\fR [ \fIlibrary\fR\&.\|.\|. ] #include \fBint\fR \fBcpc_bind_curlwp\fR(\fBcpc_t *\fR\fIcpc\fR, \fBcpc_set_t *\fR\fIset\fR, \fBuint_t\fR \fIflags\fR); .fi .LP .nf \fBint\fR \fBcpc_bind_pctx\fR(\fBcpc_t *\fR\fIcpc\fR, \fBpctx_t *\fR\fIpctx\fR, \fBid_t\fR \fIid\fR, \fBcpc_set_t *\fR\fIset\fR, \fBuint_t\fR \fIflags\fR); .fi .LP .nf \fBint\fR \fBcpc_bind_cpu\fR(\fBcpc_t *\fR\fIcpc\fR, \fBprocessorid_t\fR \fIid\fR, \fBcpc_set_t *\fR\fIset\fR, \fBuint_t\fR \fIflags\fR); .fi .LP .nf \fBint\fR \fBcpc_unbind\fR(\fBcpc_t *\fR\fIcpc\fR, \fBcpc_set_t *\fR\fIset\fR); .fi .LP .nf \fBint\fR \fBcpc_request_preset\fR(\fBcpc_t *\fR\fIcpc\fR, \fBint\fR \fIindex\fR, \fBuint64_t\fR \fIpreset\fR); .fi .LP .nf \fBint\fR \fBcpc_set_restart\fR(\fBcpc_t *\fR\fIcpc\fR, \fBcpc_set_t *\fR\fIset\fR); .fi .SH DESCRIPTION .sp .LP These functions program the processor's hardware counters according to the requests contained in the \fIset\fR argument. If these functions are successful, then upon return the physical counters will have been assigned to count events on behalf of each request in the set, and each counter will be enabled as configured. .sp .LP The \fBcpc_bind_curlwp()\fR function binds the set to the calling \fBLWP\fR. If successful, a performance counter context is associated with the \fBLWP\fR that allows the system to virtualize the hardware counters and the hardware sampling to that specific \fBLWP\fR. .sp .LP By default, the system binds the set to the current \fBLWP\fR only. If the \fBCPC_BIND_LWP_INHERIT\fR flag is present in the \fIflags\fR argument, however, any subsequent \fBLWP\fRs created by the current \fBLWP\fR will inherit a copy of the request set. The newly created \fBLWP\fR will have its virtualized 64-bit counters initialized to the preset values specified in \fIset\fR, and the counters will be enabled and begin counting and sampling events on behalf of the new \fBLWP\fR. This automatic inheritance behavior can be useful when dealing with multithreaded programs to determine aggregate statistics for the program as a whole. .sp .LP If the \fBCPC_BIND_LWP_INHERIT\fR flag is specified and any of the requests in the set have the \fBCPC_OVF_NOTIFY_EMT\fR flag set, the process will immediately dispatch a \fBSIGEMT\fR signal to the freshly created \fBLWP\fR so that it can preset its counters appropriately on the new \fBLWP\fR. For the CPC request, this initialization condition can be detected using \fBcpc_set_sample\fR(3CPC) and looking at the counter value for any requests with \fBCPC_OVF_NOTIFY_EMT\fR set. The value of any such counters will be \fBUINT64_MAX\fR. For the SMPL request, no special value returned by \fBcpc_set_sample\fR(3CPC) is prepared to tell the initialization condition of the freshly created LWP. .sp .LP The \fBcpc_bind_pctx()\fR function binds the set to the \fBLWP\fR specified by the \fIpctx\fR-\fIid\fR pair, where \fIpctx\fR refers to a handle returned from \fBlibpctx\fR and \fIid\fR is the ID of the desired \fBLWP\fR in the target process. 
If successful, a performance counter context is associated with the specified \fBLWP\fR and the system virtualizes the hardware counters to that specific \fBLWP\fR. The \fIflags\fR argument is reserved for future use and must always be \fB0\fR.
.sp
.LP
The \fBcpc_bind_cpu()\fR function binds the set to the specified CPU and measures events occurring on that CPU regardless of which \fBLWP\fR is running. Only one such binding can be active on the specified CPU at a time. As long as any application has bound a set to a CPU, per-LWP counters are unavailable and any attempt to use either \fBcpc_bind_curlwp()\fR or \fBcpc_bind_pctx()\fR returns \fBEAGAIN\fR.
.sp
.LP
The purpose of the \fIflags\fR argument is to modify the behavior of \fBcpc_bind_cpu()\fR to adapt to different calling strategies.
.sp
.LP
Values for the \fIflags\fR argument are defined in \fB<libcpc.h>\fR as follows:
.sp
.in +2
.nf
#define CPC_FLAGS_DEFAULT    0
#define CPC_FLAGS_NORELE     0x01
#define CPC_FLAGS_NOPBIND    0x02
.fi
.in -2
.sp
.LP
When \fIflags\fR is set to \fBCPC_FLAGS_DEFAULT\fR, the library binds the calling LWP to the measured CPU with \fBprocessor_bind\fR(2). The application must not change its processor binding until after it has unbound the set with \fBcpc_unbind()\fR.
.sp
.LP
The remaining \fIflags\fR may be used individually or bitwise-OR'ed together.
.sp
.LP
When only \fBCPC_FLAGS_NORELE\fR is asserted, the library binds the set to the measured CPU using \fBprocessor_bind()\fR. When the set is unbound using \fBcpc_unbind()\fR, the library will unbind the set but will not unbind the calling thread from the measured CPU.
.sp
.LP
When only \fBCPC_FLAGS_NOPBIND\fR is asserted, the library does not bind the calling thread to the measured CPU when binding the counter set, with the expectation that the calling thread is already bound to the measured CPU. If the thread is not bound to the CPU, the function will fail. When the set is unbound using \fBcpc_unbind()\fR, the library will unbind the set and the calling thread from the measured CPU.
.sp
.LP
If both flags are asserted (\fBCPC_FLAGS_NOPBIND\fR|\fBCPC_FLAGS_NORELE\fR), the set is bound and unbound from the measured CPU but the calling thread's CPU binding is never altered.
.sp
.LP
The intended use of \fBCPC_FLAGS_NOPBIND\fR and \fBCPC_FLAGS_NORELE\fR is to allow a thread to cycle through a collection of counter sets without incurring overhead from altering the calling thread's CPU binding unnecessarily. A sketch of this usage appears below.
.sp
.LP
The \fBcpc_request_preset()\fR function updates the preset and current value stored in the indexed request within the currently bound set, thereby changing the starting value for the specified request for the calling \fBLWP\fR only, which takes effect at the next call to \fBcpc_set_restart()\fR.
.sp
.LP
When a performance counter counting on behalf of a request with the \fBCPC_OVF_NOTIFY_EMT\fR flag set overflows, the performance counters are frozen and the \fBLWP\fR to which the set is bound receives a \fBSIGEMT\fR signal. The \fBcpc_set_restart()\fR function can be called from a \fBSIGEMT\fR signal handler function to quickly restart the hardware counters. Counting begins from each request's original preset (see \fBcpc_set_add_request\fR(3CPC)), or from the preset specified in a prior call to \fBcpc_request_preset()\fR. Applications performing performance counter overflow profiling should use the \fBcpc_set_restart()\fR function to quickly restart counting after receiving a \fBSIGEMT\fR overflow signal and recording any relevant program state.
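.sp
.LP
For example, the following fragment is a minimal sketch of the binding flags described above: a thread that has already bound itself to CPU \fIcpuid\fR with \fBprocessor_bind\fR(2) cycles through two counter sets without the library ever altering the thread's processor binding. The \fIcpc\fR handle, the sets \fIsetA\fR and \fIsetB\fR, and \fIcpuid\fR are assumed to have been created elsewhere, and \fBerror()\fR is a user-provided routine as in the EXAMPLES section.
.sp
.in +2
.nf
/*
 * Sketch only: setA and setB are cpc_set_t handles built with
 * cpc_set_create() and cpc_set_add_request().  The calling thread
 * is assumed to be bound to cpuid already.
 */
uint_t flags = CPC_FLAGS_NOPBIND | CPC_FLAGS_NORELE;

if (cpc_bind_cpu(cpc, cpuid, setA, flags) != 0)
    error("cannot bind setA: %s", strerror(errno));
/* ... sample setA with cpc_set_sample() ... */
if (cpc_unbind(cpc, setA) != 0)
    error("cannot unbind setA: %s", strerror(errno));

/* The thread remains bound to cpuid; bind the next set. */
if (cpc_bind_cpu(cpc, cpuid, setB, flags) != 0)
    error("cannot bind setB: %s", strerror(errno));
/* ... sample setB with cpc_set_sample() ... */
if (cpc_unbind(cpc, setB) != 0)
    error("cannot unbind setB: %s", strerror(errno));
.fi
.in -2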
.sp
.LP
When hardware sampling for a SMPL request with the \fBCPC_OVF_NOTIFY_EMT\fR flag set has collected the requested number of SMPL records, the \fBLWP\fR to which the set is bound receives a \fBSIGEMT\fR signal; unlike a CPC request, however, the hardware sampling is not frozen. If the application wants to stop the hardware sampling temporarily, it can call \fBcpc_disable\fR(3CPC) from the \fBSIGEMT\fR signal handler, and it can later call \fBcpc_enable\fR(3CPC) to restart the hardware sampling.
.sp
.LP
The \fBcpc_unbind()\fR function unbinds the set from the resource to which it is bound. All hardware resources associated with the bound set are freed. If the set was bound to a CPU, the calling LWP is unbound from the corresponding CPU according to the policy requested when the set was bound using \fBcpc_bind_cpu()\fR.
.SH RETURN VALUES
.sp
.LP
Upon successful completion these functions return 0. Otherwise, -1 is returned and \fBerrno\fR is set to indicate the error.
.SH ERRORS
.sp
.LP
Applications wanting to get detailed error values should register an error handler with \fBcpc_seterrhndlr\fR(3CPC). Otherwise, the library will output a specific error description to \fBstderr\fR.
.sp
.LP
These functions will fail if:
.sp
.ne 2
.mk
.na
\fB\fBEACCES\fR\fR
.ad
.RS 11n
.rt
For \fBcpc_bind_curlwp()\fR, the system has Pentium 4 processors with HyperThreading and at least one physical processor has more than one hardware thread online. See NOTES.
.sp
For \fBcpc_bind_cpu()\fR, the process does not have the \fIcpc_cpu\fR privilege to access the CPU's counters.
.sp
For \fBcpc_bind_curlwp()\fR, \fBcpc_bind_cpu()\fR, and \fBcpc_bind_pctx()\fR, access to the requested hypervisor event was denied.
.RE
.sp
.ne 2
.mk
.na
\fB\fBEAGAIN\fR\fR
.ad
.RS 11n
.rt
For \fBcpc_bind_curlwp()\fR and \fBcpc_bind_pctx()\fR, the performance counters are not available for use by the application.
.sp
For \fBcpc_bind_cpu()\fR, another process has already bound to this CPU. Only one process is allowed to bind to a CPU at a time and only one set can be bound to a CPU at a time.
.RE
.sp
.ne 2
.mk
.na
\fB\fBEINVAL\fR\fR
.ad
.RS 11n
.rt
The set does not contain any requests or \fBcpc_set_add_request()\fR was not called.
.sp
The value given for an attribute of a request is out of range.
.sp
The system could not assign a physical counter to each request in the set. See NOTES.
.sp
One or more requests in the set conflict and might not be programmed simultaneously.
.sp
The \fIset\fR was not created with the same \fIcpc\fR handle.
.sp
For \fBcpc_bind_cpu()\fR, the specified processor does not exist.
.sp
For \fBcpc_unbind()\fR, the set is not bound.
.sp
For \fBcpc_request_preset()\fR and \fBcpc_set_restart()\fR, the calling \fBLWP\fR does not have a bound set.
.RE
.sp
.ne 2
.mk
.na
\fB\fBENOSYS\fR\fR
.ad
.RS 11n
.rt
For \fBcpc_bind_cpu()\fR, the specified processor is not online.
.RE
.sp
.ne 2
.mk
.na
\fB\fBENOTSUP\fR\fR
.ad
.RS 11n
.rt
The \fBcpc_bind_curlwp()\fR function was called with the \fBCPC_OVF_NOTIFY_EMT\fR flag, but the underlying processor is not capable of detecting counter overflow.
.RE
.sp
.ne 2
.mk
.na
\fB\fBESRCH\fR\fR
.ad
.RS 11n
.rt
For \fBcpc_bind_pctx()\fR, the specified \fBLWP\fR in the target process does not exist.
.RE
.SH EXAMPLES
.LP
\fBExample 1 \fRUse hardware performance counters to measure events in a process.
.sp
.LP
The following example demonstrates how a standalone application can be instrumented with the \fBlibcpc\fR(3LIB) functions to use hardware performance counters to measure events in a process. The application performs 20 iterations of a computation, measuring the counter values for each iteration. By default, the example makes use of two counters to measure external cache references and external cache hits. These events are only appropriate for UltraSPARC processors. By setting the EVENT0 and EVENT1 environment variables to other strings (a list of which can be obtained from the \fB-h\fR option of the \fBcpustat\fR(1M) or \fBcputrack\fR(1) utilities), other events can be counted. The \fBerror()\fR routine is assumed to be a user-provided routine analogous to the familiar \fBprintf\fR(3C) function from the C library that also performs an \fBexit\fR(2) after printing the message.
.sp
.in +2
.nf
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <libcpc.h>
#include <errno.h>

int
main(int argc, char *argv[])
{
    int iter;
    char *event0 = NULL, *event1 = NULL;
    cpc_t *cpc;
    cpc_set_t *set;
    cpc_buf_t *diff, *after, *before;
    int ind0, ind1;
    uint64_t val0, val1;

    if ((cpc = cpc_open(CPC_VER_CURRENT)) == NULL)
        error("perf counters unavailable: %s", strerror(errno));

    if ((event0 = getenv("EVENT0")) == NULL)
        event0 = "EC_ref";
    if ((event1 = getenv("EVENT1")) == NULL)
        event1 = "EC_hit";

    if ((set = cpc_set_create(cpc)) == NULL)
        error("could not create set: %s", strerror(errno));

    if ((ind0 = cpc_set_add_request(cpc, set, event0, 0,
        CPC_COUNT_USER, 0, NULL)) == -1)
        error("could not add first request: %s", strerror(errno));

    if ((ind1 = cpc_set_add_request(cpc, set, event1, 0,
        CPC_COUNT_USER, 0, NULL)) == -1)
        error("could not add second request: %s", strerror(errno));

    if ((diff = cpc_buf_create(cpc, set)) == NULL)
        error("could not create buffer: %s", strerror(errno));
    if ((after = cpc_buf_create(cpc, set)) == NULL)
        error("could not create buffer: %s", strerror(errno));
    if ((before = cpc_buf_create(cpc, set)) == NULL)
        error("could not create buffer: %s", strerror(errno));

    if (cpc_bind_curlwp(cpc, set, 0) == -1)
        error("cannot bind lwp%d: %s", _lwp_self(), strerror(errno));

    for (iter = 1; iter <= 20; iter++) {
        if (cpc_set_sample(cpc, set, before) == -1)
            break;

        /* ==> Computation to be measured goes here <== */

        if (cpc_set_sample(cpc, set, after) == -1)
            break;

        cpc_buf_sub(cpc, diff, after, before);
        cpc_buf_get(cpc, diff, ind0, &val0);
        cpc_buf_get(cpc, diff, ind1, &val1);

        (void) printf("%3d: %" PRId64 " %" PRId64 "\en",
            iter, val0, val1);
    }

    if (iter != 21)
        error("cannot sample set: %s", strerror(errno));

    cpc_close(cpc);

    return (0);
}
.fi
.in -2
.LP
\fBExample 2 \fRWrite a signal handler to catch overflow signals.
.sp
.LP
The following example builds on Example 1 and demonstrates how to write the signal handler to catch overflow signals. A counter is preset so that it is 1000 counts short of overflowing. After 1000 counts the signal handler is invoked.
.sp
.LP
The signal handler:
.sp
.in +2
.nf
cpc_t *cpc;
cpc_set_t *set;
cpc_buf_t *buf;
int index;

void
emt_handler(int sig, siginfo_t *sip, void *arg)
{
    ucontext_t *uap = arg;
    uint64_t val;

    if (sig != SIGEMT || sip->si_code != EMT_CPCOVF) {
        psignal(sig, "example");
        psiginfo(sip, "example");
        return;
    }

    (void) printf("lwp%d - si_addr %p ucontext: %%pc %p %%sp %p\en",
        _lwp_self(), (void *)sip->si_addr,
        (void *)uap->uc_mcontext.gregs[PC],
        (void *)uap->uc_mcontext.gregs[SP]);

    if (cpc_set_sample(cpc, set, buf) != 0)
        error("cannot sample: %s", strerror(errno));
    cpc_buf_get(cpc, buf, index, &val);

    (void) printf("0x%" PRIx64 "\en", val);
    (void) fflush(stdout);

    /*
     * Update a request's preset and restart the counters. Counters which
     * have not been preset with cpc_request_preset() will resume counting
     * from their current value.
     */
    if (cpc_request_preset(cpc, ind1, val1) != 0)
        error("cannot set preset for request %d: %s", ind1,
            strerror(errno));
    if (cpc_set_restart(cpc, set) != 0)
        error("cannot restart lwp%d: %s", _lwp_self(), strerror(errno));
}
.fi
.in -2
.sp
.LP
The setup code, which can be positioned after the code that opens the CPC library and creates a set:
.sp
.in +2
.nf
#define PRESET (UINT64_MAX - 999ull)

struct sigaction act;
\&...

act.sa_sigaction = emt_handler;
bzero(&act.sa_mask, sizeof (act.sa_mask));
act.sa_flags = SA_RESTART|SA_SIGINFO;

if (sigaction(SIGEMT, &act, NULL) == -1)
    error("sigaction: %s", strerror(errno));

if ((index = cpc_set_add_request(cpc, set, event, PRESET,
    CPC_COUNT_USER | CPC_OVF_NOTIFY_EMT, 0, NULL)) == -1)
    error("cannot add request to set: %s", strerror(errno));

if ((buf = cpc_buf_create(cpc, set)) == NULL)
    error("cannot create buffer: %s", strerror(errno));

if (cpc_bind_curlwp(cpc, set, 0) == -1)
    error("cannot bind lwp%d: %s", _lwp_self(), strerror(errno));

for (iter = 1; iter <= 20; iter++) {
    /* ==> Computation to be measured goes here <== */
}

cpc_unbind(cpc, set);    /* done */
.fi
.in -2
.LP
\fBExample 3 \fRUse Hardware Performance Counters and Hardware Sampling to Measure Events in a Process
.sp
.LP
The following example demonstrates how a standalone application can be instrumented with the \fBlibcpc\fR(3LIB) functions to use hardware performance counters and hardware sampling to measure events in a process on an Intel platform supporting Precise Event Based Sampling (PEBS). The sample code binds two monitoring events for the hardware performance counters and two monitoring events for the hardware sampling to the current thread. If any monitoring request causes an overflow, the signal handler invoked by the \fBSIGEMT\fR signal retrieves the monitoring results. When the sample code finishes the task that would be coded in the section commented as \fIDo something here\fR, it retrieves the monitoring results and closes the session.
.sp
.in +2
.nf
#include <inttypes.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <errno.h>
#include <libcpc.h>

#define NEVENTS         4

#define EVENT0          "mem_uops_retired.all_loads"
#define EVENT1          "mem_uops_retired.all_stores"
#define EVENT2          "uops_retired.all"
#define EVENT3          "mem_trans_retired.load_latency"

#define RATIO0          0x100000ULL
#define RATIO1          0x100000ULL
#define RATIO2          0x100000ULL
#define RATIO3          0x100000ULL

#define PRESET_VALUE0   (UINT64_MAX - RATIO0)
#define PRESET_VALUE1   (UINT64_MAX - RATIO1)
#define PRESET_VALUE2   (UINT64_MAX - RATIO2)
#define PRESET_VALUE3   (UINT64_MAX - RATIO3)

typedef struct _rec_names {
    const char *name;
    int index;
    struct _rec_names *next;
} rec_names_t;

typedef struct _rec_items {
    uint_t max_idx;
    rec_names_t *rec_names;
} rec_items_t;

typedef struct {
    char *event;
    uint64_t preset;
    uint_t flag;
    cpc_attr_t *attr;
    int nattr;
    int *recitems;
    uint_t rec_count;
    int idx;
    int nrecs;
    rec_items_t *ri;
} events_t;

static cpc_attr_t attr2[] = {{ "smpl_nrecs", 50 }};
static cpc_attr_t attr3[] = {{ "smpl_nrecs", 10 },
    { "ld_lat_threshold", 100 }};

static events_t events[NEVENTS] = {
    { EVENT0, PRESET_VALUE0, CPC_COUNT_USER | CPC_OVF_NOTIFY_EMT,
        NULL, 0, NULL, 0, 0, 0 },
    { EVENT1, PRESET_VALUE1, CPC_COUNT_USER | CPC_OVF_NOTIFY_EMT,
        NULL, 0, NULL, 0, 0, 0 },
    { EVENT2, PRESET_VALUE2,
        CPC_COUNT_USER | CPC_OVF_NOTIFY_EMT | CPC_HW_SMPL,
        attr2, 1, NULL, 0, 0, 0 },
    { EVENT3, PRESET_VALUE3,
        CPC_COUNT_USER | CPC_OVF_NOTIFY_EMT | CPC_HW_SMPL,
        attr3, 2, NULL, 0, 0, 0 }
};

static int err;
static cpc_t *cpc;
static cpc_set_t *cpc_set;
static cpc_buf_t *cpc_buf_sig;

/* ARGSUSED */
static void
mk_rec_items(void *arg, cpc_set_t *set, int request_index,
    const char *name, int rec_idx)
{
    events_t *ev = (events_t *)arg;
    rec_names_t *p, *q, *nn;

    if ((nn = malloc(sizeof (rec_names_t))) == NULL)
        return;
    nn->name = name;
    nn->index = rec_idx;

    p = NULL;
    q = ev->ri->rec_names;
    while (q != NULL) {
        if (rec_idx < q->index)
            break;
        p = q;
        q = q->next;
    }
    nn->next = q;
    if (p == NULL)
        ev->ri->rec_names = nn;
    else
        p->next = nn;

    if (ev->ri->max_idx < rec_idx)
        ev->ri->max_idx = rec_idx;
}

static rec_names_t *
find_recitem(events_t *ev, int index)
{
    rec_names_t *p = ev->ri->rec_names;

    while (p != NULL) {
        if (p->index == index)
            return (p);
        else if (p->index > index)
            return (NULL);
        else
            p = p->next;
    }
    return (NULL);
}

static int
setup_recitems(events_t *ev)
{
    if ((ev->ri = calloc(1, sizeof (rec_items_t))) == NULL)
        return (-1);

    errno = 0;
    cpc_walk_smpl_recitems_req(cpc, cpc_set, ev->idx, ev, mk_rec_items);
    if (errno != 0)
        return (-1);

    return (0);
}

static void
show_record(uint64_t *rec, events_t *ev)
{
    rec_names_t *item;
    int i;

    (void) printf("----------------------------------\en");
    for (i = 0; i <= ev->ri->max_idx; i++) {
        if ((item = find_recitem(ev, i)) == NULL) {
            continue;
        }
        (void) printf("%02d: \e"%s\e": 0x%" PRIx64 "\en",
            i, item->name, rec[i]);
    }
    (void) printf("----------------------------------\en");
}

static void
show_buf_header(cpc_buf_t *buf)
{
    hrtime_t ht;
    uint64_t tick;

    (void) printf("***************** results *****************\en");
    ht = cpc_buf_hrtime(cpc, buf);
    (void) printf("hrtime: %" PRId64 "\en", ht);
    tick = cpc_buf_tick(cpc, buf);
    (void) printf("tick: %" PRIu64 "\en", tick);
}

static void
show_cpc_buf(cpc_buf_t *buf, events_t *ev)
{
    uint64_t val;

    (void) printf("Req#%d:\en", ev->idx);
    if (cpc_buf_get(cpc, buf, ev->idx, &val) != 0) {
        err = 1;
        return;
    }
    (void) printf(" counter val: 0x%" PRIx64, val);
    if (val < ev->preset)
        (void) printf(" : overflowed\en");
    else
        (void) printf("\en");
}

static void
show_smpl_buf(cpc_buf_t *buf, events_t *ev)
{
    uint64_t *recb;
    int i;
(void) printf("Req#%d:\en", ev->idx); (void) printf(" retrieved count: %u", ev->rec_count); if (ev->rec_count == ev->nrecs) (void) printf(" : overflowed\en"); else (void) printf("\en"); for (i = 0; i < ev->rec_count; i++) { recb = cpc_buf_smpl_get_record(cpc, buf, ev->idx, i); if (recb == NULL) { err = 1; return; } show_record(recb, ev); } } static int retrieve_results(cpc_buf_t *buf) { int i; int repeat = 0; if (cpc_set_sample(cpc, cpc_set, buf) != 0) { return (-1); } show_buf_header(buf); /* Show CPC results */ for (i = 0; i < NEVENTS; i++) { if (!(events[i].flag & CPC_HW_SMPL)) { /* CPC request */ show_cpc_buf(buf, &events[i]); continue; } /* SMPL request */ if (cpc_buf_smpl_rec_count(cpc, buf, events[i].idx, &events[i].rec_count) != 0) { return (-1); } if (events[i].rec_count > 0) show_smpl_buf(buf, &events[i]); if (events[i].rec_count == events[i].nrecs) repeat++; } /* Show remaining SMPL results */ while (repeat > 0) { if (cpc_set_sample(cpc, cpc_set, buf) != 0) return (-1); repeat = 0; for (i = 0; i < NEVENTS; i++) { if (!(events[i].flag & CPC_HW_SMPL)) { /* CPC request */ continue; } if (cpc_buf_smpl_rec_count(cpc, buf, events[i].idx, &events[i].rec_count) != 0) { return (-1); } if (events[i].rec_count > 0) { (void) printf("For req#%d, more than 1 " "retrieval of the sampling results " "were required. Consider to adjust " "the preset value and smpl_nrecs " "value.\en", i); show_smpl_buf(buf, &events[i]); } if (events[i].rec_count == events[i].nrecs) repeat++; } } /* flushed all SMPL results */ return (0); } /* ARGSUSED */ static void sig_handler(int sig, siginfo_t *sip, void *arg) { (void) fprintf(stdout, "signal handler called\en"); if (sig != SIGEMT || sip == NULL || sip->si_code != EMT_CPCOVF) { err = 1; return; } /* Disable all requests */ if (cpc_disable(cpc) != 0) { err = 1; return; } if (retrieve_results(cpc_buf_sig) != 0) { err = 1; return; } /* Enable all requests */ if (cpc_enable(cpc) != 0) { err = 1; return; } /* Restart and reset requests */ if (cpc_set_restart(cpc, cpc_set) != 0) { err = 1; return; } } int main(void) { struct sigaction sa; events_t *ev; cpc_buf_t *cpc_buf; int i; int result = 0; if ((cpc = cpc_open(CPC_VER_CURRENT)) == NULL) { (void) fprintf(stderr, "cpc_open() failed\en"); exit(1); } if ((cpc_caps(cpc) & CPC_CAP_OVERFLOW_SMPL) == 0) { (void) fprintf(stderr, "OVERFLOW CAP is missing\en"); result = -2; goto cleanup_close; } if ((cpc_caps(cpc) & CPC_CAP_SMPL) == 0) { (void) fprintf(stderr, "HW SMPL CAP is missing\en"); result = -2; goto cleanup_close; } if ((cpc_set = cpc_set_create(cpc)) == NULL) { (void) fprintf(stderr, "cpc_set_create() failed\en"); result = -2; goto cleanup_close; } for (i = 0; i < NEVENTS; i++) { ev = &events[i]; if (ev->flag & CPC_HW_SMPL) { ev->nrecs = ev->attr[0].ca_val; } ev->idx = cpc_set_add_request(cpc, cpc_set, ev->event, ev->preset, ev->flag, ev->nattr, ev->attr); if (ev->idx < 0) { (void) fprintf(stderr, "cpc_set_add_request() failed\en"); result = -2; goto cleanup_set; } if (ev->flag & CPC_HW_SMPL) { if (setup_recitems(ev) != 0) { (void) fprintf(stderr, "setup_recitems() failed\en"); result = -2; goto cleanup_set; } } } if ((cpc_buf = cpc_buf_create(cpc, cpc_set)) == NULL) { (void) fprintf(stderr, "cpc_buf_create() failed\en"); result = -2; goto cleanup_set; } if ((cpc_buf_sig = cpc_buf_create(cpc, cpc_set)) == NULL) { (void) fprintf(stderr, "cpc_buf_create() failed\en"); result = -2; goto cleanup_set; } sa.sa_sigaction = sig_handler; sa.sa_flags = SA_RESTART | SA_SIGINFO; (void) sigemptyset(&sa.sa_mask); if 
    if (sigaction(SIGEMT, &sa, NULL) != 0) {
        (void) fprintf(stderr, "sigaction() failed\en");
        result = -2;
        goto cleanup_set;
    }

    if (cpc_bind_curlwp(cpc, cpc_set, 0) != 0) {
        (void) fprintf(stderr, "cpc_bind_curlwp() failed\en");
        result = -2;
        goto cleanup_set;
    }

    /*
     * ==================
     * Do something here.
     * ==================
     */

    if (err) {
        (void) fprintf(stderr, "Error happened\en");
        result = -2;
        goto cleanup_bind;
    }

    (void) cpc_disable(cpc);

    if (retrieve_results(cpc_buf) != 0) {
        (void) fprintf(stderr, "retrieve_results() failed\en");
        result = -2;
        goto cleanup_bind;
    }

cleanup_bind:
    (void) cpc_unbind(cpc, cpc_set);
cleanup_set:
    (void) cpc_set_destroy(cpc, cpc_set);
cleanup_close:
    (void) cpc_close(cpc);

    return (result);
}
.fi
.in -2
.SH ATTRIBUTES
.sp
.LP
See \fBattributes\fR(5) for descriptions of the following attributes:
.sp
.sp
.TS
tab(:) box;
cw(2.75i) |cw(2.75i)
lw(2.75i) |lw(2.75i) .
ATTRIBUTE TYPE:ATTRIBUTE VALUE
_
Interface Stability:Committed
_
MT-Level:Safe
.TE
.SH SEE ALSO
.sp
.LP
\fBcpustat\fR(1M), \fBcputrack\fR(1), \fBpsrinfo\fR(1M), \fBprocessor_bind\fR(2), \fBcpc_seterrhndlr\fR(3CPC), \fBcpc_set_sample\fR(3CPC), \fBlibcpc\fR(3LIB), \fBattributes\fR(5)
.SH NOTES
.sp
.LP
When a set is bound, the system assigns a physical hardware counter to count on behalf of each request in the set. If such an assignment is not possible for all requests in the set, the bind function returns -1 and sets \fBerrno\fR to \fBEINVAL\fR. The assignment of requests to counters depends on the capabilities of the available counters. Some processors (such as Pentium 4) have a complicated counter control mechanism that requires the reservation of limited hardware resources beyond the actual counters. Two requests for different events might therefore be impossible to count at the same time because of these limited hardware resources. See the processor manual as referenced by \fBcpc_cpuref\fR(3CPC) for details about the underlying processor's capabilities and limitations.
.sp
.LP
Some processors can be configured to dispatch an interrupt when a physical counter overflows. The most obvious use for this facility is to ensure that the full 64-bit counter values are maintained without repeated sampling. Certain hardware, such as the UltraSPARC processor, does not record which counter overflowed. A more subtle use for this facility is to preset the counter to a value slightly less than the maximum value, then use the resulting interrupt to catch the counter overflow associated with that event. The overflow can then be used as an indication of the frequency of the occurrence of that event.
.sp
.LP
The interrupt generated by the processor might not be particularly precise. That is, the particular instruction that caused the counter overflow might be earlier in the instruction stream than is indicated by the program counter value in the ucontext.
.sp
.LP
When a CPC request is added to a set with the \fBCPC_OVF_NOTIFY_EMT\fR flag set, then as before, the control registers and counter are preset from the 64-bit preset value given. When the flag is set, however, the kernel arranges to send the calling process a \fBSIGEMT\fR signal when the overflow occurs. The \fBsi_code\fR member of the corresponding \fBsiginfo\fR structure is set to \fBEMT_CPCOVF\fR and the \fBsi_addr\fR member takes the program counter value at the time the overflow interrupt was delivered. Counting is disabled until the set is bound again.
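.sp
.LP
For example, to arrange for a notification after approximately every \fIn\fR occurrences of an event, the preset can be computed as a distance below the counter maximum. The following minimal sketch (the helper name is hypothetical) shows the arithmetic behind the \fBPRESET\fR value in Example 2; keeping \fIn\fR within 31 bits keeps the result inside the portable range described below.
.sp
.in +2
.nf
#include <inttypes.h>

/*
 * Hypothetical helper: return a preset value that causes an
 * overflow notification after approximately "interval" more
 * event occurrences.
 */
static uint64_t
overflow_preset(uint64_t interval)
{
    return (UINT64_MAX - interval + 1);
}
.fi
.in -2
.sp
.LP
For instance, \fBoverflow_preset(1000)\fR yields \fBUINT64_MAX\fR - 999, the value used for \fBPRESET\fR in Example 2.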
.sp
.LP
When a SMPL request is added to a set with the \fBCPC_OVF_NOTIFY_EMT\fR flag set, then as before, the control registers and counter for the sampling are preset from the 64-bit preset value given. When the flag is set, however, the kernel arranges to send the calling process a \fBSIGEMT\fR signal when the hardware has collected the requested number of SMPL records for the SMPL request. The \fBsi_code\fR member of the corresponding \fBsiginfo\fR structure is set to \fBEMT_CPCOVF\fR and the \fBsi_addr\fR member takes the program counter value at the time the overflow interrupt for the sampling hardware was delivered. Sampling remains enabled.
.sp
.LP
If the \fBCPC_CAP_OVERFLOW_PRECISE\fR bit is set in the value returned by \fBcpc_caps\fR(3CPC), the processor is able to determine precisely which counter has overflowed after receiving the overflow interrupt. On such processors, the \fBSIGEMT\fR signal is sent only if a counter overflows and the request that the counter is counting has the \fBCPC_OVF_NOTIFY_EMT\fR flag set. If the capability is not present on the processor, the system sends a \fBSIGEMT\fR signal to the process if any of its requests have the \fBCPC_OVF_NOTIFY_EMT\fR flag set and any counter in its set overflows.
.sp
.LP
Different processors have different counter ranges available, though all processors supported by Solaris allow at least 31 bits to be specified as a counter preset value. Portable preset values lie in the range \fBUINT64_MAX\fR to \fBUINT64_MAX\fR-\fBINT32_MAX\fR.
.sp
.LP
The appropriate preset value will often need to be determined experimentally. Typically, this value will depend on the event being measured as well as the desire to minimize the impact of the act of measurement on the event being measured. Less frequent interrupts and samples lead to less perturbation of the system.
.sp
.LP
If the processor cannot detect counter overflow, the bind functions will fail and return \fBENOTSUP\fR. Only user events can be measured using this technique. See Example 2.
.SS "Pentium 4"
.sp
.LP
Most Pentium 4 events require the specification of an event mask for counting. The event mask is specified with the \fIemask\fR attribute.
.sp
.LP
Pentium 4 processors with HyperThreading Technology have only one set of hardware counters per physical processor. To use \fBcpc_bind_curlwp()\fR or \fBcpc_bind_pctx()\fR to measure per-\fBLWP\fR events on a system with Pentium 4 HT processors, a system administrator must first take processors in the system offline until each physical processor has only one hardware thread online (see the \fB-p\fR option to \fBpsrinfo\fR(1M)). If a second hardware thread is brought online, all per-\fBLWP\fR bound contexts will be invalidated and any attempt to sample or bind a CPC set will return \fBEAGAIN\fR.
.sp
.LP
Only one CPC set at a time can be bound to a physical processor with \fBcpc_bind_cpu()\fR. Any call to \fBcpc_bind_cpu()\fR that attempts to bind a set to a processor that shares a physical processor with a processor that already has a CPU-bound set returns an error.
.sp
.LP
To measure the shared state on a Pentium 4 processor with HyperThreading, the \fIcount_sibling_usr\fR and \fIcount_sibling_sys\fR attributes are provided for use with \fBcpc_bind_cpu()\fR. These attributes behave exactly as the \fBCPC_COUNT_USER\fR and \fBCPC_COUNT_SYSTEM\fR request flags, except that they act on the sibling hardware thread sharing the physical processor with the CPU measured by \fBcpc_bind_cpu()\fR. Some CPC sets will fail to bind due to resource constraints.
The most common type of resource constraint is an ESCR conflict among two or more requests in the set. For example, the \fIbranch_retired\fR event cannot be measured on counters 12 and 13 simultaneously because both counters require the \fBCRU_ESCR2\fR ESCR to measure this event. To measure \fIbranch_retired\fR events simultaneously on more than one counter, use counters such that one counter uses \fBCRU_ESCR2\fR and the other counter uses \fBCRU_ESCR3\fR. See the processor documentation for details.
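.sp
.LP
For example, a Pentium 4 request that needs an event mask might be added as in the following minimal sketch. The \fIemask\fR value \fB0x1\fR is a placeholder; consult the processor documentation for the mask bits that apply to a given event. The \fIcpc\fR handle and \fIset\fR are assumed to have been created as in the EXAMPLES section, and \fBerror()\fR is the user-provided routine described there.
.sp
.in +2
.nf
/* Sketch: pass the emask attribute when adding a request. */
cpc_attr_t attr = { "emask", 0x1 };    /* placeholder mask value */
int ind;

if ((ind = cpc_set_add_request(cpc, set, "branch_retired", 0,
    CPC_COUNT_USER, 1, &attr)) == -1)
    error("could not add request: %s", strerror(errno));
.fi
.in -2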