The present study addresses the question of whether crowding effects, that is, interactions among adjacent features or characters, emerge automatically or through so-called higher-level controlled processing. Two experiments are presented that compare performance in detecting, localizing, and identifying a flanked target in the same strings when the target was defined by either its form or its category. Detection and localization performance was better for form-defined than for category-defined targets, whereas the reverse was observed for identification performance. This shows that the interacting information is indeed high level, in that it is affected by task settings such as the target-defining feature and the observers' task set. The results suggest that crowding effects emerge not from processes that depend on the parameters of stimulus presentation, but from processes activated by certain task sets.