Preparatory α-band oscillations reflect spatial gating independently of predictions regarding target identity

Preparatory α-band oscillations reflect spatial gating independently of predictions regarding target identity. J Neurophysiol.
Modulations of cortical α-band oscillations are a reliable index of the voluntary allocation of covert spatial attention. It is currently unclear whether attentional cues containing information about a target's identity (such as its visual orientation), in addition to its location, might additionally shape preparatory α modulations. Here, we explore this question by directly comparing spatial and feature-based attention in the same visual detection task while recording brain activity using magnetoencephalography (MEG). At the behavioral level, preparatory feature-based and spatial attention cues both improved performance and did so independently of each other. Using MEG, we replicated robust α lateralization following spatial cues: in preparation for a visual target, α power decreased contralaterally and increased ipsilaterally to the attended location. Critically, however, preparatory α lateralization was not significantly modulated by predictions regarding target identity, as carried via the behaviorally effective feature-based attention cues. Furthermore, nonlateralized α power during the cue-target interval did not differentiate between uninformative cues and cues carrying feature-based predictions either. Based on these results we propose that preparatory α modulations play a role in the gating of information between spatially segregated cortical regions and are therefore particularly well suited for spatial gating of information.
NEW & NOTEWORTHY The present work clarifies if and how human brain oscillations in the α-band support multiple types of anticipatory attention. Using magnetoencephalography, we show that posterior α-band oscillations are modulated by predictions regarding the spatial location of an upcoming visual target, but not by feature-based predictions regarding its identity, despite robust behavioral benefits. This provides novel insights into the functional role of preparatory α mechanisms and suggests a limited specificity with which they may operate.

When expecting a task-relevant stimulus, we often have multiple types of predictive information, for example, about its spatial location and identity. In such cases, we can optimize behavior using both space- and feature-based attention (Egner et al. 2008; White et al. 2015). In this study, we investigate whether and how oscillatory rhythms in the α-band reflect multiple types of anticipatory attention signals.
The role of α-band (8- to 14-Hz) oscillations in spatial attention has been an active area of research for over a decade. In preparation for an upcoming visual target, attention to a particular location decreases posterior α power contralaterally to the attended location and increases α power ipsilaterally to the attended location (Kelly et al. 2006; Sauseng et al. 2005; Siegel et al. 2008; Thut et al. 2006; Worden et al. 2000). Similar α lateralization with attentional allocation is also observed in the somatosensory system when anticipating tactile stimuli (Haegens et al. 2011a; van Ede et al. 2011). Such preparatory α modulations have been proposed to reflect functional inhibition of sensory regions processing irrelevant information and/or a release of inhibition of those regions processing relevant information (Foxe and Snyder 2011; Jensen et al. 2012; Jensen and Mazaheri 2010; Klimesch 2012). Critically, these α modulations have been shown to be functionally relevant, predicting task performance on a trial-by-trial basis (Mathewson et al. 2009; Romei et al. 2010; van Dijk et al. 2008).
Despite the wealth of literature on α oscillations in preparatory spatial attention, it is currently not clear whether α lateralization is modulated by content predictions in addition to spatial predictions. Content predictions can be hypothesized to influence preparatory α modulations in at least two ways. First, content predictions may strengthen α modulations. The opportunity to preload the anticipated target identity into working memory might lead to stronger engagement in relevant (contralateral) visual areas. Conversely, identity predictions could also be hypothesized to lead to weaker α modulations, since only those populations coding for the expected target features need to upregulate their excitability (as putatively indexed by the α modulation), as opposed to the larger cortical population coding for all possible target identities.
Here, we independently manipulated spatial and feature-based preparatory attention while recording brain activity using magnetoencephalography (MEG). In contrast to our hypotheses, target-identity predictions did not significantly modulate α oscillations despite having a clear benefit on performance. We propose that α power that can be detected at the spatial scales accessible to MEG does not show significant modulations when task-relevant feature levels compete for representations within brain regions.

Participants
All experimental protocols were reviewed and approved by the Central University Research Ethics Committee of the University of Oxford. Twenty volunteers took part in the experiment (9 male, 18-35 yr, all right handed). All participants had normal or corrected-to-normal vision and were naïve to the purpose of the experiment. Participants gave written informed consent before taking part and received financial compensation (£15/h for sessions involving MEG and £10/h for sessions involving only behavioral testing).

Procedures
To examine the role of brain rhythms in preparatory feature-based and spatial attention, we asked participants to perform a target-detection task on two peripheral orientation stimuli, which were preceded by a cue. Figure 1 provides a schematic of the experimental task. Participants completed two experimental sessions on separate days. An initial behavioral session was conducted to provide training on the behavioral task. Task performance was accompanied by MEG recordings in a second session, always completed within one week of the training session.

Task and Stimuli
The task was programmed in MATLAB version 7.10 (MathWorks) and presented using the Psychophysics Toolbox in Matlab (Kleiner et al. 2007). Stimuli were back projected (Panasonic PT D7700E) on a screen at a viewing distance of 120 cm with a spatial resolution of 1,280 by 1,024 pixels and a refresh rate of 60 Hz. Stimuli appeared against a uniform midgray background.
The task involved detecting the presence of a target orientation stimulus (horizontal or vertical grating) within an array of two peripherally presented gratings (left and right visual field). Arrays were preceded by a centrally presented attentional cue that could contain information regarding the target's location (left/right), its identity (vertical/horizontal), or both (see Fig. 1B). Spatial and feature cueing was thus independently manipulated, resulting in the following four different conditions: neutral cueing (N), spatial-only cueing (S), feature-only cueing (F), and combined spatial-feature cueing (SF). A target stimulus was present in 50% of trials. When the target was present, a distracting tilted-orientation grating stimulus (45°) occurred on the other visual field. Target-absent arrays always contained two tilted gratings. These could have the same or different orientations. Participants were instructed to maintain central fixation during task performance.
Fig. 1 (caption, continued). Here, the parts of the cue highlighted in blue indicate the relevant feature dimension and/or spatial location while the background color is magenta. Highlight and background color were counterbalanced between participants. Feature and spatial cueing were independently manipulated, resulting in the following 4 conditions: neutral cues (N), feature-only cues (F), spatial-only cues (S), and combined spatial-feature cues (SF). C: behavioral results in the magnetoencephalography (MEG) session. Valid spatial cueing led to shorter reaction times (RTs, in ms) and higher accuracy. Valid feature cueing shortened RTs and improved accuracy. Feature and spatial cueing did not interact. Error bars reflect ±1 SE.
Trials began with a 1,000- to 1,500-ms (randomly determined) presentation of a central fixation dot (0.1° visual angle in width and height). This was followed by the presentation of a central cue stimulus for 200 ms. The cue stimulus consisted of a diamond shape (1.6° visual angle in width and height) around a plus sign in its center. To cue spatial location (left vs. right) the left or right half of the diamond shape was highlighted in blue or magenta. To cue the feature identity (horizontal vs. vertical) the vertical or horizontal line of the plus sign was highlighted in blue or magenta (see Fig. 1B). The highlighting color was counterbalanced between participants, with the other color serving as the default "background" color of the cue stimulus (i.e., the neutral cue was either all blue or all magenta). Participants were explicitly told that cues did not predict whether the target would be present or absent but that, instead, if the target would be present, it would occur at the cued location and be of the cued identity (if this information was contained in the cue). Following a cue-target interstimulus interval of 1,200 ms, a stimulus array was presented for 50 ms and then backward masked (after an interval of 67 ms) for 283 ms.
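As an illustration of the trial timing just described, the following minimal Python sketch adds up the stated durations relative to cue onset. The constant and function names are hypothetical, and it assumes that the 1,200-ms interval runs from cue offset to array onset and that the 67-ms interval runs from array offset to mask onset.

```python
# Event times within one trial, in ms relative to cue onset.
# Durations come from the task description; all names are illustrative.
CUE_MS = 200               # cue duration
CUE_TARGET_ISI_MS = 1200   # assumed: cue offset to array onset
ARRAY_MS = 50              # stimulus array duration
ARRAY_MASK_GAP_MS = 67     # assumed: array offset to mask onset
MASK_MS = 283              # backward mask duration


def trial_timeline():
    """Return a dict of event onsets/offsets in ms from cue onset."""
    t = {}
    t["cue_on"] = 0
    t["cue_off"] = t["cue_on"] + CUE_MS
    t["array_on"] = t["cue_off"] + CUE_TARGET_ISI_MS
    t["array_off"] = t["array_on"] + ARRAY_MS
    t["mask_on"] = t["array_off"] + ARRAY_MASK_GAP_MS
    t["mask_off"] = t["mask_on"] + MASK_MS
    return t


timeline = trial_timeline()
for name, ms in timeline.items():
    print(f"{name:>9}: {ms:5d} ms")
```

Under these assumptions the array appears 1,400 ms after cue onset, which matches the 0 to 1.4 s cue-target window used in the MEG analyses.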
Stimulus arrays consisted of either a horizontally or vertically oriented target Gabor patch (Gaussian-vignetted sinusoidal gratings) and a ±45° tilted distracter Gabor patch (target present) or two distracter tilted Gabor patches (target absent). Gabor patch stimuli (diameter: 2° of visual angle) were presented at a contrast for which the target detection task was performed at 75% accuracy (see below for details of the staircasing procedure) with a spatial frequency of three cycles per degree. The Gaussian envelope had a space constant of 0.44°. The orientations of the target and distracter stimuli were pseudorandomly selected on each trial. A target was present on one-half of the trials (randomly selected at the beginning of the experiment). Gabor patches were presented ±5.2° from the vertical meridian and 3° below the horizontal meridian. Backward-mask stimuli were constructed by applying a Gaussian vignette to the convolution of 100% contrast square-wave gratings at four orientations of 90, 180, 45, and −45°. Stimuli and masks were presented atop 10% contrast luminance pedestals that were present throughout the entire experiment. Participants responded (target present or absent) by making a right-handed index or middle finger button press during a response period of up to 2,000 ms. Response mappings were counterbalanced between participants. The subsequent intertrial interval was 1,000 ms (±500 ms). All conditions were equiprobable and randomized.
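To make the stimulus definition concrete, here is a minimal Python sketch of a Gabor patch as a sinusoidal carrier multiplied by a Gaussian envelope, using the spatial frequency and space constant given above. It is not the authors' stimulus code: the function names are hypothetical, the space constant is treated as the Gaussian standard deviation, and the angle parameter gives the axis of luminance modulation rather than any particular orientation-naming convention.

```python
import math

SF_CPD = 3.0       # spatial frequency in cycles per degree (from the text)
SIGMA_DEG = 0.44   # Gaussian space constant; treated here as the SD (assumption)


def gabor(x_deg, y_deg, axis_deg, contrast, phase=0.0):
    """Signed contrast of a Gabor at (x, y), in degrees from the patch center.

    axis_deg is the direction along which luminance varies:
    0 means variation along x, which produces vertical bars.
    """
    theta = math.radians(axis_deg)
    u = x_deg * math.cos(theta) + y_deg * math.sin(theta)
    carrier = math.cos(2 * math.pi * SF_CPD * u + phase)
    envelope = math.exp(-(x_deg ** 2 + y_deg ** 2) / (2 * SIGMA_DEG ** 2))
    return contrast * carrier * envelope


def render(axis_deg, contrast, size_deg=2.0, n=41):
    """Sample the patch on an n x n grid spanning size_deg degrees."""
    half = size_deg / 2
    coords = [-half + size_deg * i / (n - 1) for i in range(n)]
    return [[gabor(x, y, axis_deg, contrast) for x in coords] for y in coords]


patch = render(axis_deg=0.0, contrast=0.5)
center = patch[20][20]  # grid point at (0, 0)
```

Adding such a signed-contrast patch to a mid-gray background would give luminance values; the threshold contrast itself would come from the staircase procedure.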
Participants performed a calibration session before the behavioral training and MEG experimental session using an adaptive psychophysical staircase procedure to estimate the threshold contrast for perceiving the Gabors. Task difficulty was adjusted for each participant by titrating the contrast of the Gabor patches for which the target detection task was performed at 75% accuracy when only neutral cues were used.
Behavioral training session. Before the MEG session, on a separate day, participants were trained on the behavioral task to ensure they were familiar with the task and could efficiently use the different cue types. Participants completed a total of 288 trials of the experimental task (split into three runs) in a MEG mock scanner. In each run, short rest periods were introduced every 32 trials. The first run served as a practice run to familiarize participants with the task, and stimulus contrast was set to 50%. All cues were included in this run. The second run comprised the staircase procedure starting at 50% contrast and was repeated if necessary. Only neutral cues were included in the staircase run. In the third run, the contrast was set to the threshold estimated from the staircase, and spatial, feature, and spatial-feature (but no neutral) cues were included.
MEG session. Whole head MEG recordings were acquired with a 306-channel Vectorview system (Elekta-Neuromag, Helsinki, Finland) at the Oxford Center for Human Brain Activity (OHBA). The system contains 306 sensors: 102 magnetometers and 204 orthogonally oriented planar gradiometers (102 latitudinal gradiometers, 102 longitudinal gradiometers). The MEG scanner was housed in a three-layer magnetically shielded room (Imedco). The data were sampled at 1,000 Hz.
Additional electrodes were used to record eye movements (EOG) and heart rate (ECG) during MEG acquisition. A vertical pair of EOG electrodes was placed above and below the left eye to detect blinks. To detect lateral eye movements, a horizontal pair was placed with one electrode to the left of the left orbit and the other electrode to the right of the right orbit. Electrodes were placed on the left and right wrist to record ECG. In addition, eye movements were monitored online and recorded at 1,000 Hz with a remote infrared eye tracker (EyeLink 1000; SR Research) controlled via the Psychophysics Toolbox in MATLAB version 7.10 (MathWorks). Four magnetic coils served as head-position indicator (HPI) coils. These were positioned on each mastoid bone and on each side of the forehead near the hairline. The positions of the HPI coils relative to three fiducial points and the subject's head shape were digitized using a Polhemus 3D tracking system (FastTrak 3D; Polhemus). HPI coils were activated at the beginning of each block to localize the participant's head with respect to the sensor array and to monitor the subject's head position so that head movements could be corrected later during analysis.
During the recording session, participants sat in a reclining chair and supported their head against the back and top of the MEG helmet.
Participants were asked to remain as still as possible during the recording session and were continuously monitored by a video camera. Participants were given a fiberoptic button box and made responses using their right and left index finger. Stimuli were projected on a screen placed around 120 cm in front of the subject. MEG data were recorded in five blocks. Participants completed a total of 640 trials of the experimental task split into five blocks of 128 trials each. The four experimental conditions (SF, S, F, N) were presented in randomized order, resulting in a total of 160 trials/condition.

Analysis
Behavioral analysis. Hypotheses regarding the effects of experimental parameters on recall accuracy and reaction times (RTs, in ms) in correct trials were tested using analysis of variance (ANOVA) and t-tests (Bonferroni corrected where appropriate). To guide our interpretation of the MEG results, we only used participants' performance during the MEG session.
MEG analysis. Analysis of the MEG data was performed using a combination of custom-written MATLAB scripts, in-house OHBA software library, SPM8 (Litvak et al. 2011; http://www.fil.ion.ucl.ac.uk/spm), and FieldTrip (http://www.ru.nl/fcdonders/fieldtrip/). All time samples were corrected with respect to the refresh delay of the projector (measured online with a photodiode). MEG analysis was performed in two main steps as follows: 1) preprocessing of the data to remove artifacts present in the raw data and 2) sensor-level analysis consisting of the time-frequency decomposition of the MEG data to examine the time-frequency dynamics of preparatory α-band modulations. The epoch of interest for analyses was the period between cue and target onset.
PREPROCESSING. The raw MEG data were visually inspected to remove any channels with excessive noise. The data were then denoised using Maxfilter signal-space separation (Taulu et al. 2005), and compensated for changes in head position within session using the MaxMove software, both implemented in MaxFilter version 2.2 (Elekta Neuromag). The Maxfilter software works by mathematically transforming the data to a set of virtual sensors, which is made possible by location information provided by the MEG sensor array and each person's continuous head position signal recorded from the HPI coils. Transforming the data into virtual sensors gives a standard reference frame and allows for data to be combined across recordings. The data were subsequently downsampled to 250 Hz. After applying a high-pass filter at 1 Hz, the data were then epoched with respect to each cue onset (from −2 to +3 s). This time window encompassed both cue-target interval activity (from 0 s onward) and target-related activity (from 1.4 s onward). The resulting epochs were manually inspected for artifacts. Systematic artifacts (including eye blinks and heart beats) were identified and regressed out of the data using the following procedure. Independent component analysis (ICA) was used to decompose the sensor data for each session into 150 temporally independent components (tICs) and associated sensor topographies using FastICA (http://research.ics.aalto.fi/ica/fastica). Artifact components were manually identified by the combined inspection of the spatial topography, time course, kurtosis of the time course, and frequency spectrum for all components. Artifactual components were then rejected by subtracting them out of the data (in the majority of cases 3-5 components were removed from each subject). Finally, data were imported into FieldTrip and inspected using the semiautomatic rejection tool to discard any remaining trials with excessive variance at 8-14 Hz.
TIME-FREQUENCY ANALYSIS. Time-frequency analysis of the time-domain sensor-space signal was performed using a fast Fourier transform algorithm for the range of frequencies between 3 and 30 Hz. The interval −1 to +2.5 s relative to cue onset was used for analysis. Data from within a 300-ms sliding time window, which was advanced in steps of 60 ms, were multiplied with a Hanning taper and Fourier transformed in 1-Hz steps. As a result of the 300-ms sliding time window, all depicted time points incorporate signal from ±150 ms around that time point. We then averaged the resulting power spectra over trials within each condition of interest. Magnetometers were discarded, and the power time series in the gradiometer pairs were combined (Cartesian sum, i.e., summing the squares of the latitudinal and longitudinal gradiometer activity and taking the square root), resulting in a 102-sensor combined planar gradiometer map of sensor-space time-frequency power.
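The sliding-window decomposition and the gradiometer combination described above can be sketched in a few lines of Python. This is a simplified illustration and not the FieldTrip implementation used in the study: the function names are hypothetical, power is left unnormalized, and a direct Fourier sum at a single frequency stands in for the FFT so that the 1-Hz frequency steps can be evaluated on a 300-ms window.

```python
import cmath
import math


def hann(n):
    """Symmetric Hanning taper of length n."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def power_at(segment, freq_hz, fs_hz):
    """Unnormalized power of a Hann-tapered segment at one frequency."""
    acc = 0j
    for i, (x, w) in enumerate(zip(segment, hann(len(segment)))):
        acc += x * w * cmath.exp(-2j * math.pi * freq_hz * i / fs_hz)
    return abs(acc) ** 2


def sliding_power(signal, freq_hz, fs_hz=250, win_s=0.3, step_s=0.06):
    """Return (window center in s, power) for each sliding window."""
    n_win = int(round(win_s * fs_hz))    # 75 samples at 250 Hz
    n_step = int(round(step_s * fs_hz))  # 15 samples at 250 Hz
    out = []
    for start in range(0, len(signal) - n_win + 1, n_step):
        center_s = (start + n_win / 2) / fs_hz
        out.append((center_s, power_at(signal[start:start + n_win], freq_hz, fs_hz)))
    return out


def combine_planar(latitudinal, longitudinal):
    """Cartesian sum of a gradiometer pair: sqrt(a^2 + b^2)."""
    return math.hypot(latitudinal, longitudinal)


# Synthetic 2-s, 10-Hz test signal sampled at 250 Hz (illustrative data only).
fs = 250
sig = [math.sin(2 * math.pi * 10 * i / fs) for i in range(2 * fs)]
alpha = sliding_power(sig, 10.0)
beta = sliding_power(sig, 20.0)
```

Each window center lies 150 ms after the window start, which is why every depicted time point mixes in signal from ±150 ms around it.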
For each participant, we first contrasted the power spectra between left and right spatial (SF and S) cues in all MEG channels and expressed this as a percentage change {i.e., [(left − right)/(left + right)] × 100}. Second, to examine effects of feature information on cue-induced α lateralization, we calculated lateralization indexes for spatial-only (S) cues and for spatial-feature (SF) cues separately and compared them. Concretely, for clusters consisting of left and right MEG channels, we calculated the normalized difference in power between trials in which the target was expected contralateral or ipsilateral to those channels and also expressed this as a percentage change: [(contralateral − ipsilateral)/(contralateral + ipsilateral)] × 100. We subsequently collapsed this contralateral vs. ipsilateral metric across left and right channel clusters. The lateralization indexes were calculated two times using different channel selections: first, including all left and all right MEG channels and, second, focusing on α power in visual areas by focusing on a subset of the most informative MEG channels, which we will refer to as visual channels. To identify visual channels, we selected channels above left and right visual areas by plotting the group-level grand average α lateralization (from the analysis described above, i.e., averaged over all types of spatial cues) and manually selected the five left and right channels that showed the maximal response (power suppression contralateral and increase ipsilateral).
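The lateralization index defined above is simple enough to state as code. The sketch below is illustrative only: the function names and the example power values are hypothetical, and collapsing across hemispheres by averaging the two indexes is an assumption, since the text does not specify how the collapse was done.

```python
def lateralization_index(contra, ipsi):
    """[(contralateral - ipsilateral) / (contralateral + ipsilateral)] x 100."""
    return 100.0 * (contra - ipsi) / (contra + ipsi)


def collapsed_index(left_ch_cue_left, left_ch_cue_right,
                    right_ch_cue_left, right_ch_cue_right):
    """Collapse the index over left and right channel clusters.

    For left channels a right cue is contralateral; for right channels
    a left cue is contralateral. Averaging is an assumption.
    """
    li_left = lateralization_index(left_ch_cue_right, left_ch_cue_left)
    li_right = lateralization_index(right_ch_cue_left, right_ch_cue_right)
    return (li_left + li_right) / 2


# Made-up alpha power values (arbitrary units) showing the expected pattern:
# lower power contralateral to the cued side than ipsilateral.
example = lateralization_index(contra=8.0, ipsi=12.0)
both = collapsed_index(12.0, 8.0, 9.0, 11.0)
```

A negative index therefore indicates relatively lower α power contralateral to the attended location, and normalizing by the sum makes the measure comparable across participants with different overall power.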
Next, to examine the effects of feature information on ␣ lateralization in more detail, we compared the time courses of contralateral and ipsilateral responses following spatial-feature (SF) vs. spatial-only (S) cues. In contrast to the analysis described above, this time, we completed the analysis on visual channels only.
Finally, to examine effects of feature cueing on nonlateralized α power, we also contrasted power spectra in feature-only (F) cueing to neutral (N) cueing conditions. This analysis was done on all channels and on visual channels.
At the group level, we tested for significance using cluster-based permutation statistics (5,000 permutations, α-level 0.05), which circumvent the multiple-comparisons problem by evaluating the full dataspace under a single permutation distribution of the largest cluster (Maris and Oostenveld 2007). Unless noted otherwise, clustering was performed across space (channels) and time (0-1.4 s from cue onset), with regard to data that were averaged in the a priori-defined frequency band of interest (i.e., the α-band; 8-14 Hz).
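The logic of a cluster-based permutation test can be illustrated with a reduced Python sketch that clusters over time only. It is not the procedure used in the study, which clustered across channels and time in FieldTrip: the function names are hypothetical, the cluster statistic (summed t values), the cluster-forming threshold (2.093, the two-sided 5% critical t for 19 degrees of freedom), the sign-flipping scheme for paired data, and the simulated data are all assumptions.

```python
import math
import random


def t_values(diffs):
    """One-sample t per time point; diffs[s][t] is subject s's condition difference."""
    n = len(diffs)
    out = []
    for t in range(len(diffs[0])):
        col = [d[t] for d in diffs]
        mean = sum(col) / n
        var = sum((v - mean) ** 2 for v in col) / (n - 1)
        out.append(mean / math.sqrt(var / n) if var > 0 else 0.0)
    return out


def max_cluster_mass(tvals, thresh):
    """Largest |summed t| over runs of adjacent same-sign supra-threshold points."""
    best, mass, sign = 0.0, 0.0, 0
    for tv in tvals:
        s = 1 if tv > thresh else (-1 if tv < -thresh else 0)
        if s != 0 and s == sign:
            mass += tv            # extend the current cluster
        else:
            mass = tv if s != 0 else 0.0  # start a new cluster or reset
            sign = s
        best = max(best, abs(mass))
    return best


def cluster_permutation_p(diffs, thresh, n_perm=5000, seed=0):
    """Return (observed max cluster mass, permutation p value)."""
    rng = random.Random(seed)
    observed = max_cluster_mass(t_values(diffs), thresh)
    count = 0
    for _ in range(n_perm):
        # Randomly flip the sign of each subject's whole difference time course.
        flipped = []
        for d in diffs:
            sgn = rng.choice((-1.0, 1.0))
            flipped.append([sgn * v for v in d])
        if max_cluster_mass(t_values(flipped), thresh) >= observed:
            count += 1
    return observed, (count + 1) / (n_perm + 1)


# Simulated data: 20 subjects, 24 time points, an effect at time points 8-15.
rng = random.Random(1)
diffs = [[rng.gauss(1.5 if 8 <= t < 16 else 0.0, 1.0) for t in range(24)]
         for _ in range(20)]
mass, p = cluster_permutation_p(diffs, thresh=2.093, n_perm=500)
```

Because only the largest cluster of each permutation enters the null distribution, a single comparison against that distribution controls the family-wise error rate across all time points.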

Behavioral Results
Repeated-measures ANOVAs were carried out on accuracy and RTs (RTs from correct trials only) with the factors spatial cue (present, absent) and feature cue (present, absent). Behavioral data are depicted in Fig. 1C. To assess whether feature information alone improved performance, we compared feature-only with neutral cues and spatial-feature with spatial-only cues using repeated-measures t-tests.

MEG Results
Modulations of posterior α-band lateralization by spatial and feature cueing. To assess the spatial cueing effect, we compared α-band power following left vs. right spatial cues (S, SF). To this end, we compared α power (averaged over 8-14 Hz) from 0 to 1.4 s post cue onset, between left and right spatial cues. A cluster permutation test on this contrast in the α-band (8-14 Hz), considering all time points from 0 to 1.4 s and all MEG channels, revealed a significant positive cluster over left posterior sensors from 0.32 to 1.3 s (P = 0.0044) and a just-significant negative cluster over right posterior sensors from 0.44 to 0.92 s after cue onset (P = 0.0452). Figure 2A depicts this contrast, averaged over all significant time points of the positive cluster. Next, we calculated α lateralization indexes, combining left and right cues to calculate contralateral and ipsilateral preparatory activity, and then contrasting the two (see Fig. 2B). To assess how α lateralization following spatial cues was modulated by the presence of feature information, we calculated the α lateralization indexes separately for spatial-feature (SF) and spatial-only (S) cues, first using all left and right MEG channels and then using a subset of the left and right visual channels (see METHODS for details) only. Figure 2C depicts the time courses of the α lateralization indexes for the two conditions using all channels, and Fig. 2D shows the same using visual channels only. A comparison of the α lateralization indexes over all channels (considering all time points and frequencies between 3 and 30 Hz) revealed one significant positive cluster (4 Hz, 0.44-0.62 s, P = 0.025), showing a stronger (possibly faster) lateralization when the cue contained no feature information, in the spatial-only cueing condition (S). Similarly, this was the case when comparing α lateralization indexes over visual channels only (1 positive cluster, 6-16 Hz, 0.32-0.62 s, P = 0.004).
It is conceivable that this difference reflects differences in the demands to interpret the cue stimulus. It may be more straightforward to interpret S cues compared with SF cues, leading to a faster deployment of spatial attention with S compared with SF cues. Indeed, and critically, in the most relevant preparatory period just before the anticipated target onset (at 1.4 s), the modulations by spatial attention were virtually indistinguishable between spatial-feature and spatial-only cues (i.e., spatial cues with and without target identity predictions). This was the case no matter what channel selection we used.
Contralateral and ipsilateral α-band responses relative to neutral cues. To evaluate α lateralization more carefully, we then looked separately at contralateral and ipsilateral α-band power changes in visual channels. To this end, we overlaid the time courses of spatial-only, combined spatial-feature, and neutral cues, normalized by a precue baseline. As suggested by Fig. 3, the attentional modulation (vs. neutral) appeared slightly delayed for combined spatial-feature compared with spatial-only cues in both contra- and ipsilateral visual channels. Indeed, cluster-permutation tests, averaging over α-band power and visual channels, revealed a significant difference between ipsilateral responses following spatial-feature vs. spatial-only cues (1 negative cluster from 0.38 to 0.68 s, P = 0.006, see Fig. 3B). However, there was no difference between contralateral responses following spatial-feature vs. spatial-only cues (see Fig. 3A). Again, in both contra- and ipsilateral channels, these conditions became virtually indistinguishable toward the end of the cue-target interval. We further noted that both spatial-feature and spatial-only cues elicited significant contralateral and ipsilateral responses compared with neutral cues (contralateral spatial-feature vs. neutral: 1 negative cluster from 0.44 to 0.62 s, P = 0.024; contralateral spatial-only vs. neutral: 1 negative cluster from 0.44 to 0.56 s, P = 0.047; ipsilateral spatial-feature vs. neutral: 1 positive cluster from 0.86 to 1.4 s, P = 0.002; ipsilateral spatial-only vs. neutral: 1 positive cluster from 0.44 to 1.4 s, P < 0.001; see Fig. 3, A and B).
Modulations of nonlateralized α-band power by feature cueing. As stated above, it is feasible that the observed difference in α lateralization between spatial-feature and spatial-only cues from 0.44 to 0.62 s after cue onset reflects low-level differences in cue processing (as spatial-feature cues are more complex than spatial-only cues). Alternatively, however, this initially reduced α lateralization with feature information could also be due to a global modulation linked to preparatory feature-based attention (Bichot et al. 2005; Jehee et al. 2011; Martinez-Trujillo and Treue 2004; McAdams and Maunsell 2000; Saenz et al. 2002; Serences and Boynton 2007; Treue and Martínez Trujillo 1999) whereby α power might be attenuated in a nonlateralized fashion (thereby potentially countering and/or delaying the lateralization effect). If so, then one would expect such global modulation also to show in the contrast between pure feature (F) and neutral (N) cues. However, cluster permutation tests on the averaged α-band power (8-14 Hz), considering all time points from 0 s to 1.4 s and all channels (Fig. 4A), or all visual channels (Fig. 4B), revealed no significant clusters.
We also explored other frequency bands across all channels and time points but did not find any robust effects. Concretely, we compared power in the predefined θ (4-8 Hz)-, α (8-14 Hz)-, β (14-30 Hz)-, and γ (40-100 Hz)-bands between feature-only and neutral cues. Cluster-based permutation tests, averaging over respective frequency bands and considering all time points from 0 to 1.4 s and all MEG channels, only revealed one significant positive cluster in the θ-band from 0.152 to 0.508 s (P = 0.003) that centered on medial frontal channels (see Fig. 5). Although this is a potentially interesting effect, its relatively early timing suggests that it, too, may reflect differences in cue processing rather than a sustained effect of anticipatory attention. Moreover, this analysis was only exploratory and did not correct for multiple comparisons across frequencies.

DISCUSSION
In this study, we set out to investigate whether predictions about target identity, in addition to predictions about locations, modulate preparatory attentional α modulations. We considered two hypotheses: target identity predictions may strengthen α modulation because of relevant cortical areas showing stronger engagement (e.g., by loading the target template in working memory), or target identity predictions may lead to weaker α modulations, since only selected populations coding for the expected target feature modulate their excitability (as opposed to the larger cortical population coding for all possible target identities). Our results do not provide support for either hypothesis. Preparing for the spatial location of an upcoming visual target stimulus elicited robust α lateralization. However, when observers additionally held predictions about the identity of the upcoming target (and prepared both for the spatial location and the orientation of the upcoming target), α lateralization was not further modulated apart from an early difference that was likely related to delayed deployment of attention with more complex cues (i.e., there was no α modulation in the most relevant preparatory time window). Of course we cannot rule out that both hypothesized mechanisms were at play, counteracting one another (i.e., stronger engagement of a smaller population), but we consider it highly unlikely that this would result in a perfectly balanced activation pattern.
Fig. 3. Time courses of contralateral (A) and ipsilateral (B) α power responses, averaged over visual channels, for SF (blue line) and S (green line) cues, together with α power responses following N cues (black line), time locked to cue onset. The dotted and solid blue bars denote significant differences between N cueing and S cueing conditions and between N and SF cueing conditions, respectively (P < 0.05).
It is key to note that the lack of an observed modulation of α lateralization (and nonlateralized power) by feature-based attention was not due to ineffective feature cues, because these cues did elicit a clear behavioral performance benefit. Similarly, it is important to highlight that neutral cues were effective at shifting attention, since they provided useful information for preparation (Nobre and Rohenkohl 2014; Posner 1980) and only lack information about the most likely feature/location of the target.

Relation to Other Studies of Feature-Based Attention and ␣ Oscillations
The role of ␣ power in spatial attention has been studied extensively, and several studies have demonstrated spatially specific modulations of ␣ power in preparation for an upcoming target stimulus (e.g., Sauseng et al. 2005;Thut et al. 2006;Worden et al. 2000), in line with the findings reported here. Fewer studies have assessed the effects of feature-based attention and target identity predictions on ␣ power. Snyder and Foxe (2010) examined whether ␣ mechanisms are also involved in nonspatial preparatory attention using a featuredimension cueing task where either color or motion information was task relevant in different trials. ␣ power modulations were observed bilaterally along the dorsal stream in motion area MT when motion was task relevant or along the ventral stream in color area V4 when color was task relevant. This study seems similar to ours, but there is a key difference between the experimental designs. In our task, participants were cued to attend to local feature identities (horizontal vs. vertical), processed by overlapping neural populations, whereas Snyder and Foxe (2010) cued participants to attend to feature dimensions, processed by large nonoverlapping specialized visual modules, and no specific target identity was known in advance.
In another related study, Mayer and colleagues showed modulation of ␣ power over task-relevant brain regions by stimulus identity predictions (Mayer et al. 2016). Participants viewed a letter sequence consisting of two parts: initially, letter identity was unknown and revealed over successive trials as letter visibility increased. Following that, letter visibility was lowered back down to subthreshold levels. Participants made subjective ratings to each stimulus in a sequence regarding its detectability and identity. This design allowed comparing ␣ power preceding letters whose identity was unknown (in the first half of a sequence) with letters whose identity was known (in the second half of a sequence), under identical sensory input. It was found that prestimulus ␣ power increased over left occipital areas when stimulus identity was known compared with when it was not. Source-space analysis of the effect identified a set of areas in left temporal cortex previously linked to top-down letter processing and supramodal representations of letters. This study seems remarkably similar to ours, since here predictions were made regarding actual feature values as opposed to feature dimensions. However, we used simple orientation stimuli while Mayer et al. (2016) used more complex letter stimuli processed in more high-level areas, a potentially important difference as discussed below.
In contrast to our findings, another recent MEG study (de Lange et al. 2013) reported an increase in preparatory α power over occipital areas when observers held expectations about the feature composition of an upcoming stimulus, in that case its motion direction, compared with when observers did not have expectations. The authors proposed that this increase in α power with expectation was driven by selective modulation of subpopulations of direction-selective cells, challenging our findings. It is possible that modulations in oscillatory activity differ as a function of predictions about motion direction vs. stimulus orientation. However, there are at least two additional relevant points to consider that might account for this apparent discrepancy with our findings. First, neutral and informative cues were not perceptually matched in the previous study, which might have driven differences in their sensory processing that are unrelated to preparatory neural dynamics. The authors acknowledged this possibility. Second, in our task, the cue-related prediction was orthogonal to the decision required for task performance (as participants performed a detection task), whereas in de Lange et al. (2013) the prediction was informative about the required decision. When predictions are also informative about the decisions and responses required, they may influence additional neural populations.
Finally, another recent study by van Diepen et al. (2016) examined the role of α activity in spatial and feature-based attention. Participants performed a forced-choice identification task to a letter that appeared inside a target shape presented alongside a distracter shape. Spatial attention was manipulated using predictive cues, whereas feature-based attention was manipulated by varying target-distracter similarity (target shapes were always red, whereas distracter shapes could be presented in one of four possible colors ranging from red to yellow). The authors showed that, even when spatial cues were uninformative, α lateralization to target stimuli emerged in low-similarity distracter trials. Given that van Diepen et al. (2016) focused on target processing and manipulated feature-based attention by varying target-distracter similarity, their results do not directly inform our question of how target content predictions affect α modulations in preparation for an upcoming target, and it is difficult to relate their findings to our results and the literature discussed above.
Albeit in different ways, the findings by Mayer et al. (2016) and Snyder and Foxe (2010) have thus suggested parallels between the role of α mechanisms in selective spatial attention and feature-based attention. One proposal that could reconcile the findings of these two studies with ours is that α oscillations play a role in gating information between distant nonoverlapping neural populations. We elaborate on this below.

How Specific Can Preparatory α Modulations Be Employed?
The functional role of α power modulations has been proposed to be the gating of information flow via active inhibition of non-task-relevant cortical areas (e.g., Jensen and Mazaheri 2010; Klimesch et al. 2007), for example, by modulating neuronal spike rates (Haegens et al. 2011b). However, this account leaves open the specificity with which this mechanism can operate. We speculate that α modulations are predominantly involved in the gating of information processed by large nonoverlapping sensory areas, be it between different retinotopic locations or between different cortical areas that are specialized for processing different visual features [e.g., dorsal vs. ventral feature dimensions as in Snyder and Foxe (2010)], but not in the specific gating of information at the level of particular features that are coded in neighboring cortical columns. Importantly, this would mean that α mechanisms may be able to support feature-based attention (and identity predictions) only in tasks where information gating occurs between distant cortical areas that are specialized for processing different feature dimensions, or relatively high-level feature representations. Fine-grained overlapping areas coding for early-level feature values may not be targetable by α mechanisms.
A likely explanation for this is that (EEG-measured) α modulations reflect an aggregate measure that arises at a larger spatial scale than, for example, orientation columns. This explanation can account for the differences between the findings reported here and the findings reported by Snyder and Foxe (2010) despite the apparent similarity of these two studies. Concretely, we used specific orientations (vertical vs. horizontal) that are represented in interdigitated populations. In contrast, Snyder and Foxe (2010) used non-identity-specific feature dimensions (color vs. motion), which are coded in distant nonoverlapping dorsal vs. ventral regions. Similarly, Mayer et al. (2016) reported α modulations by identity predictions in relatively high-level cortical processing areas. In other words, the crucial distinction is not between feature-based and space-based attention but concerns which neural populations are engaged in a task; on this account, α modulations cannot operate at a level of specificity as fine as microcolumns.
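The pooling argument can be illustrated with a deliberately simplified toy sketch. This is not part of the analyses reported here and not a model of real cortex: the function name `sensor_power`, the arbitrary integer power units, and the assumption that attention lowers α in columns coding the attended feature and raises it elsewhere are all hypothetical choices made for illustration only.

```python
def sensor_power(column_prefs, attended, baseline=10, gain=2):
    """Sum hypothetical alpha power over all columns under one sensor.

    Toy assumption: columns coding the attended feature carry less alpha
    (baseline - gain); all other columns carry more (baseline + gain).
    Units are arbitrary integers so that the sums are exact.
    """
    return sum(
        baseline - gain if pref == attended else baseline + gain
        for pref in column_prefs
    )


# Interdigitated case: one sensor pools alternating orientation columns.
mixed = ["vertical", "horizontal"] * 50

# Segregated case: a sensor pools a homogeneous region (e.g., a color area).
color_region = ["color"] * 100

# Difference in pooled power between the two attention conditions.
mixed_diff = sensor_power(mixed, "vertical") - sensor_power(mixed, "horizontal")
segregated_diff = (
    sensor_power(color_region, "color") - sensor_power(color_region, "motion")
)

print(mixed_diff)       # 0: column-level changes cancel in the pooled signal
print(segregated_diff)  # -400: the pooled signal differs between conditions
```

Under these assumptions, attending to one orientation rather than the other leaves the pooled signal unchanged when the two populations are interdigitated beneath the same sensor, whereas the same column-level modulation produces a clear difference when the populations are spatially segregated.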
The findings of a recent study (Harvey et al. 2013) suggest that it may be possible to measure finer-grained differences in target-induced α with higher spatial resolution using electrocorticography (ECoG). Harvey et al. (2013) recorded separate fMRI and ECoG data from occipital and parietal areas of a patient with epilepsy while the patient viewed drifting checkerboard bars of varying orientations. BOLD responses were used to estimate population receptive fields (pRF) for each stimulus. V1 electrodes in which pRF surround was stimulated showed increased α power compared with electrodes in which pRF center was stimulated, and this pattern was observed between electrodes only 1 cm apart with partially overlapping visual field representations. These findings show that stimulus-induced α modulations can be highly localized, with differences evident between nearby V1 locations.

Neural Mechanisms of Preparatory Feature-Based Attention
The robust behavioral benefit that we observed following feature-only cues must be the result of some neural process. We focused on both global and lateralized modulations of α power and did not observe any robust signature of preparatory feature-based attention. Going beyond our initial hypotheses, we also conducted exploratory analyses in other frequency bands, but these analyses also did not yield conclusive results. Our measurements and analyses may not have been sensitive to some putative anticipatory modulatory mechanisms. Additionally, there may have been some exogenous feature-based priming (Theeuwes 2013) given that our cues highlighted the same visual features as the target stimuli. The neural substrates of the performance benefit conferred by preparatory feature-based cues, like the ones used here, thus remain an important avenue for future investigation.