Today my colleague Roel and I ran into a problem that had us baffled. To set the scene: we were migrating a news aggregation platform to Office 365. The platform consists of multiple site collections that aggregate news from each other using SharePoint Search. These site collections are practically identical, but strangely enough, on some of them the search index returned one specific field as empty, even after multiple re-indexes.
To fast-forward through a few hours of utter frustration: at one point we remembered that we had run into practically the same problem before, but in an on-premises SharePoint environment. The solution back then was to modify the SourceID in the SchemaXML of the field. The SourceID is a property that defines where a field originates (e.g. whether it is a built-in field or a site column). Below is an example of what you might find in your field’s SchemaXML:
<Field
Type="Note"
DisplayName="MainText"
[...]
SourceID="{a5a7f91f-b3c6-40e1-ab3d-0499544b8f52}"
/>
The GUID in this example can refer to either a List ID or a Web ID, and as we found out, this makes all the difference. On the sites where the field was being indexed, the SourceID referred to the Web ID; on the sites where the field remained empty, the SourceID referred to the List ID. While it’s not at all clear where this ID is used in SharePoint Search (or whether it’s used at all), I suspect that if the field is a site column, the SourceID must always refer to the Web ID rather than the List ID.
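To make the comparison concrete, here is a minimal Python sketch of the check we did by eye: it parses a field's SchemaXML and reports whether the SourceID matches a given Web ID or List ID. The function name `classify_source_id` and the idea of passing both IDs in as strings are our own illustration, not part of any SharePoint API; how you obtain the SchemaXML and the two IDs from your environment is left out on purpose.

```python
import uuid
import xml.etree.ElementTree as ET


def classify_source_id(schema_xml: str, web_id: str, list_id: str) -> str:
    """Return 'web', 'list', 'other' or 'missing' for a field's SourceID.

    Hypothetical helper for illustration: schema_xml is the field's
    SchemaXML string, web_id and list_id are GUID strings (with or
    without curly braces).
    """
    field = ET.fromstring(schema_xml)
    source_id = field.get("SourceID")
    if source_id is None:
        return "missing"
    try:
        # uuid.UUID accepts GUIDs with or without braces, in any case.
        source = uuid.UUID(source_id)
    except ValueError:
        # The SourceID is not a GUID at all (assumption: some fields
        # carry a non-GUID value here, which we simply do not classify).
        return "other"
    if source == uuid.UUID(web_id):
        return "web"
    if source == uuid.UUID(list_id):
        return "list"
    return "other"
```

In our case, the fields that were indexed correctly would come back as 'web' and the fields that stayed empty as 'list'.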
After we changed the SchemaXML of the faulty fields and scheduled a re-index of the list, the contents were finally being indexed again and we breathed a collective sigh of relief.
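For illustration, the SchemaXML change itself boils down to swapping one attribute. The Python sketch below rewrites the SourceID in a SchemaXML string so that it points at a given Web ID. The helper name `with_web_source_id` is hypothetical, and the sketch only produces the new XML string: writing it back to the field and scheduling the re-index have to happen through whatever tooling you use against SharePoint, which is not shown here.

```python
import uuid
import xml.etree.ElementTree as ET


def with_web_source_id(schema_xml: str, web_id: str) -> str:
    """Return a copy of schema_xml whose SourceID refers to web_id.

    Hypothetical helper for illustration: web_id is a GUID string,
    with or without curly braces. All other attributes are untouched.
    """
    field = ET.fromstring(schema_xml)
    # Normalise to the braced, lowercase GUID form used in the example above.
    field.set("SourceID", "{" + str(uuid.UUID(web_id)) + "}")
    return ET.tostring(field, encoding="unicode")
```

Only run something like this against fields that are site columns, since that is the only case where we suspect the SourceID should refer to the Web ID.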