Bad Node, Bad Node, whatcha gonna do?

I recently switched a site search from using external Google CSE search to using the internal Drupal Core search. This was fairly inconsequential.

I switched to Node search under Config -> Search and Metadata, swapped the CSE block out of my sidebar in Content ->Structure, made a few theme CSS changes to match and I should have been done.

Except for indexing the content, trivial, right? Just run cron a few times and you should be all set!

Only, no. That didn’t seem to work. I kept getting a couple strange errors in the log messages.

Warning: Creating default object from empty value in node_build_content() (line 1398 of /Users/user/sites/sitename/modules/node/node.module)

and

EntityMalformedException: Missing bundle property on entity of type node. in entity_extract_ids() (line 7879 of /Users/user/sites/sitename/includes/common.inc).

After a good deal of searching, and some helpful hints from the #drupal channel I tried a few different modules but none of them addressed the one issue I was having, which was increasing the indexing. I was stuck at 8%.

I finally found this post http://tappetyclick.com/blog/2013/01/14/how-find-bad-node-makes-search-indexing-cause-drupal-cron-fail

Which mentioned that they had found a solution on a Portuguese site, and translated it for us.

My drupal version is a little bit different than the example at tappetyclick, but the gist is: Output the indexing process to the log by adding a watchdog line to  the core file modules/node/node.module”

'watchdog ('cron', "indexing node {$node -> nid}"); // ADD THIS LINE'

This made my node_update_index() look like this:

function node_update_index() {
 $limit = (int)variable_get('search_cron_limit', 100);
 $result = db_query_range("SELECT n.nid FROM {node} n LEFT JOIN {search_dataset} d ON d.type = 'node' AND d.sid = n.nid WHERE d.sid IS NULL OR d.reindex <> 0 ORDER BY d.reindex ASC, n.nid ASC", 0, $limit, array(), array('target' => 'slave'));
foreach ($result as $node) {
 watchdog ('cron', "indexing node {$node -> nid}"); // ADD THIS LINE
 _node_index_node($node);
}
}

Then I ran cron manually and checked the log messages, filtering for cron types only.

cron 01/11/2016 - 5:00pm indexing node 1803 Anonymous 
cron 01/11/2016 - 5:00pm indexing node 702 Anonymous 
cron 01/11/2016 - 5:00pm indexing node 7676 Anonymous 
cron 01/11/2016 - 5:00pm indexing node 6801 Anonymous

Everytime I ran into EntityMalformedException: Missing bundle property on entity of type node in entity_extract_ids(), I’d look at the last node we’d tried to index, this was the bad node, and delete it from the database.

After removing about 40 of these bad nodes the core search index actually reached 100%. /kermitarms


Also published on Medium.

One thought on “Bad Node, Bad Node, whatcha gonna do?

  1. Thanks a lot Stacie for posting this! You have saved me a lot of time tryin’ to figure this out! I couldn’t find a useful solution in drupal.org about this problem but with your sugestion I quickly could find that “bad node”!!!

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes:

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>