Creating and Updating Nodes Programmatically in Drupal 7

The steps for programmatically creating a node are:

  • Create a PHP object representing the node data
  • Save the object using the node_save() function

While the mechanics are simple, there is an important responsibility involved. The normal Drupal workflow validates data before calling node_save(); node_save() itself does no validation. By calling node_save() directly, your code takes responsibility for providing valid data.

Drupal 7 Changes

A quick note for those of you familiar with Drupal 6. You'll notice three changes in Drupal 7:

Body Field Is No Longer Special

In Drupal 6 the body field was special. Specifically, it had a different data structure than other fields and it always existed, even if it wasn't used. With Drupal 7 the body field is a standard field provided by core and is truly optional.

Language

Language specification is required for the node and some fields.

Filter Format

The text format (filter format) of a text field is now specified by machine name, not integer. For example: full_html, filtered_html, or plain_text. This is great news for moving content between different Drupal systems.

Basic Node Creation

The following code assumes an unchanged Drupal 7 Standard installation and will create a Drupal 7 article node.

<?php
/**
 * Basic Node Creation Example for Drupal 7
 *
 * This example:
 * - Assumes a standard Drupal 7 installation
 * - Does not verify that the field values are correct
 */
 
$body_text = 'This is the body text I want entered with the node.';

// Start with an empty object, set the type, then let Drupal fill in defaults.
$node = new stdClass();
$node->type = 'article';
node_object_prepare($node);

$node->title    = 'Node Created Programmatically on ' . date('c');
$node->language = LANGUAGE_NONE;

// Field values are keyed by language code, then by delta (item number).
$node->body[$node->language][0]['value']   = $body_text;
$node->body[$node->language][0]['summary'] = text_summary($body_text);
$node->body[$node->language][0]['format']  = 'filtered_html';

// Optional URL alias for the new node.
$path = 'content/programmatically_created_node_' . date('YmdHis');
$node->path = array('alias' => $path);

node_save($node);
?>

Notes:

  • The node_object_prepare() function does a number of useful things:
    • Provides the default values for the status, promote, sticky, and revision flags. These values are type specific if $node->type is defined, and can be modified as required.
      Here is the data structure of $node after node_object_prepare() is called:
          stdClass Object
          (
              [type] => article
              [status] => 1
              [promote] => 1
              [sticky] => 0
              [uid] => 0
              [created] => 1283285249
              [revision] => 
              [comment] => 2
              [menu] => Array
                  (
                      [link_title] => 
                      [mlid] => 0
                      [plid] => 0
                      [menu_name] => main-menu:0
                      [weight] => 0
                      [options] => Array
                          (
                          )
                      [module] => menu
                      [expanded] => 0
                      [hidden] => 0
                      [has_children] => 0
                      [customized] => 0
                      [parent_depth_limit] => 8
                  )
          )
    • Makes the current user the node owner by setting the $node->uid.
      (Note: If you are running from drush and haven't explicitly set a user, the node owner will be the anonymous user, typically UID 0.)
    • Runs hook_prepare and hook_node_prepare on the object.
  • The LANGUAGE_NONE string constant value is "und". It's useful to know this if you're ever looking at the node data values.
  • The $node->body[$node->language][0]['format'] = 'filtered_html' statement sets the Text format of the Body field to Filtered HTML. It is possible to set the text format to a value the node owner isn't permitted to use, which will almost certainly cause undesirable consequences. Text formats can be changed from the Administration menu, so be careful about what you assume.
  • Node and Field API hooks called by node_save() may alter the saved data. For example, if the Pathauto module is enabled it will override the path given in this example.
  • A list of Node and Field API hooks called during node_save() is available at the Drupal API Reference Site: Node API Hooks.
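The text format note above can be turned into a check. The following is a minimal sketch, assuming it runs inside a bootstrapped Drupal 7 site; the helper name example_owner_may_use_format() is made up for illustration and is not part of Drupal. It relies on the core functions filter_format_load(), user_load(), and filter_access().

```php
<?php
// Hypothetical helper, not a Drupal function. It only works when called from
// a bootstrapped Drupal 7 site, because it relies on core API functions.
function example_owner_may_use_format($node, $format_id) {
  // Load the text format by machine name, e.g. 'filtered_html'.
  $format = filter_format_load($format_id);
  if (!$format) {
    // No text format with that machine name exists on this site.
    return FALSE;
  }

  // Load the account that owns the node and ask Drupal whether that
  // account is allowed to use the format.
  $account = user_load($node->uid);
  return $account ? filter_access($format, $account) : FALSE;
}
```

A caller could run this before node_save() and fall back to a format the owner may use, or skip the save, when it returns FALSE.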

Updates

To update a node simply load it, make the changes, and then save it. The following example assumes there is a pre-existing node with the node id (nid) of 1.

<?php
/**
 * Basic Node Update Example for Drupal 7
 *
 * This example:
 * - Assumes a standard Drupal 7 installation
 * - Assumes there is a node with a nid of 1
 */
 
$nid = 1;
$node = node_load($nid);

$node->title = 'Updated Title Text';

node_save($node);
?>

Remember, you are responsible for ensuring the data being saved is valid.
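What "valid" means depends on the content type, so the following is only a minimal sketch of the kind of pre-save check your own code might run. The function name example_node_problems() and the specific rules (a non-empty type and title, and a body format drawn from the three format names mentioned earlier) are assumptions made for illustration. They are not part of the Drupal API and do not replace the validation the node form performs. For field-level checks, the Field API's field_attach_validate() may be worth a look.

```php
<?php
// Hypothetical helper, not a Drupal function: returns a list of problems
// found in a node object before it is handed to node_save().
function example_node_problems($node, array $allowed_formats = array('filtered_html', 'full_html', 'plain_text')) {
  $problems = array();

  if (empty($node->type)) {
    $problems[] = 'Missing node type.';
  }
  if (!isset($node->title) || trim($node->title) === '') {
    $problems[] = 'Missing title.';
  }

  // The body field is optional in Drupal 7, so only check it when present.
  if (!empty($node->body) && is_array($node->body)) {
    foreach ($node->body as $langcode => $items) {
      foreach ($items as $delta => $item) {
        // Strict comparison, so the old integer format IDs are rejected.
        if (isset($item['format']) && !in_array($item['format'], $allowed_formats, TRUE)) {
          $problems[] = sprintf("Unexpected text format '%s' in body[%s][%s].", $item['format'], $langcode, $delta);
        }
      }
    }
  }

  return $problems;
}

// Usage with a hand-built object; 'und' is the value of LANGUAGE_NONE.
$node = new stdClass();
$node->type = 'article';
$node->title = '';
$node->body['und'][0] = array('value' => 'Text', 'format' => 2);

$problems = example_node_problems($node);
foreach ($problems as $problem) {
  print $problem . "\n";
}
```

In real code you would call node_save() only when the returned list is empty.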

Revisions

Revisions are a handy method for keeping a record of changes, programmatic or otherwise. If humans are involved in a review process they can also provide a measure of comfort and clarity by confirming what the program changed and what was preexisting, and providing the human reviewer a mechanism for undoing the change.

The following example assumes there is a pre-existing node with the node id (nid) of 1. The $node->log value isn't required but is useful.

<?php
/**
 * Basic Node Revision Example for Drupal 7
 *
 * This example:
 * - Assumes a standard Drupal 7 installation
 * - Assumes there is a node with a nid of 1
 */
 
$nid = 1;
$node = node_load($nid);

$node->body[$node->language][0]['value'] .= "\nA line of text added by program.";

// Make this change a new revision
$node->revision = 1;
$node->log = 'This node was programmatically updated at ' . date('c');

node_save($node);
?>

After running the program the revision tab will display the update:

[Image: Revisions tab]

Understanding the Data Structures

There's no master directory of node data structures. The most reliable way to understand the data is to check the code. This may not be easy, and I cannot provide specific guidance. Here are some places where I've found the information I needed to understand a data structure.

  • Trusted examples:
    There are a number of well maintained modules that create Drupal content programmatically. Seeing how they deal with particular fields can be helpful.
  • Node Content Form Code:
    The HTML forms used for adding and editing node content can show how a field is properly used, particularly the form validation code. Because the code is generalized, it is often difficult to home in on the correct code unless you're familiar with Form API and how the code is called during a form submit.
  • Field API Documentation:
    The Field API documentation is located at Drupal API Site: Field API.
  • Node and Field Hooks:
    After determining the module responsible for saving the data, check the code in the applicable node and field hooks.
  • Module schema:
    The data structure used in code is usually an analog of the structure used in the database. Drupal database tables are defined with a schema that usually describes each database column. The schema definition can be seen using the Schema module or by viewing the .install file of the applicable module.

Viewing Data Structures

The drush command line and/or the Devel module's Execute PHP code block can be used to display data structures. Here are sample commands for displaying a node object:

drush: drush php-eval "print print_r(node_load(12), 1)"

PHP block: dpr(node_load(12));

The "dpr" function uses the PHP print_r function to display the value. The Devel module has a number of other functions which use different display functions (e.g.: var_dump, krumo) and print to either the message area or display in the browser.
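The second argument to print_r() is what makes the drush command above work: when it is TRUE, print_r() returns the formatted text instead of printing it. Here is a minimal plain-PHP sketch, using a hand-built object as a stand-in for a loaded node (the property values are made up for illustration):

```php
<?php
// Stand-in for a loaded node; a real node object has many more properties.
$node = new stdClass();
$node->type = 'article';
$node->title = 'Example title';
$node->body = array(
  'und' => array(
    0 => array('value' => 'Body text', 'format' => 'filtered_html'),
  ),
);

// With TRUE as the second argument, print_r() returns a string, which can
// then be printed, logged, or wrapped in <pre> tags for a browser.
$dump = print_r($node, TRUE);
print $dump;
```

The nesting in the output mirrors the language/delta/column layout of the field data, which makes it a quick way to see where a value lives.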

If you've discovered a research process, useful technique, or reference I haven't mentioned, please be sure to add it to the comments.

Note: This article was updated to reflect the final version of the API.

Comments

The other way is to load and submit the node form programmatically but I'm afraid I don't have demo code.

On a somewhat related note, it seems the ability to load nodes by a number of params, as opposed to just the node ID, has disappeared in D7 (see: http://api.drupal.org/api/function/node_load/7). For example, you can do this in D6 to return a node object:

<?php
$my_node = node_load(array('uid' => 1, 'type' => 'some_type'));
?>

Any insight into why this change was made (I missed the reasoning behind it; I'm guessing performance?) or analogous solutions you'd like to cover in this series you've got going on, Dale?

@Anonymous, I think D5/6: drupal_execute() and D7: drupal_form_submit() are the functions you're thinking of: http://api.drupal.org/api/function/drupal_form_submit/7.

Thanks for mentioning it. The function is cool because it works for any Drupal form. It can be very tedious figuring out the Form API array structure, though. I used it for something in Drupal 5, but don't think I've ever used it to create nodes.

@Evan:

Here's the issue where the change to node_load() was made:
#25634: Allow node_load($nid).

It appears that entity_load() replaces the Drupal 6 node_load(array('fieldname' => 'value')) option. There's an entity_load() convenience wrapper for nodes: node_load_multiple().
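For what it's worth, here is a rough sketch of one way to do the Drupal 6 style lookup in Drupal 7 using EntityFieldQuery, which the API documentation for node_load_multiple() suggests in place of its deprecated $conditions argument. It assumes a bootstrapped Drupal 7 site; the helper name example_load_nodes_by_owner_and_type() is made up for illustration.

```php
<?php
// Hypothetical helper, not a Drupal function. It only works when called from
// a bootstrapped Drupal 7 site.
function example_load_nodes_by_owner_and_type($uid, $type) {
  $query = new EntityFieldQuery();
  $query->entityCondition('entity_type', 'node')
    ->entityCondition('bundle', $type)
    ->propertyCondition('uid', $uid);
  $result = $query->execute();

  // execute() returns an array keyed by entity type; each entry is keyed
  // by entity ID, so the keys of $result['node'] are the matching nids.
  if (empty($result['node'])) {
    return array();
  }
  return node_load_multiple(array_keys($result['node']));
}
```

Unlike the Drupal 6 call, which returned a single node object, this returns an array of node objects keyed by nid, so example_load_nodes_by_owner_and_type(1, 'some_type') may return zero, one, or many nodes.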

Yep, that's the one, thanks. You're right—the one time I needed it to work, I gave up and used node_save() instead, because of how complex the form API stuff is.

Maybe I'm missing something here, but I don't see how node_object_prepare() can actually affect the $node object: in D7, the variable $node is NOT passed by reference to the function, and the function returns nothing...

But Drupal 7 core itself uses this syntax in node_form(), so there must be something I don't get, but what? ;)

I can't find a good online explanation, which is sad, so here goes mine.

The $node variable is a handle to the object. Although the object handle is passed by value it still points to the same object; for common situations it's essentially the same as passing by reference. Since the object handle in the function and the calling code both point to the same object, changes made in the function are seen outside the function. It's like passing a pointer, but apparently it's more correct to call it a handle.
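Here is a small plain-PHP sketch of that behavior (PHP 5 or later; the function names are made up for illustration). Changing a property through the handle is visible to the caller, but assigning a new object to the parameter is not, which is the difference between passing a handle by value and passing a true reference:

```php
<?php
// Changes a property through the handle: the caller sees this change.
function example_set_title($node) {
  $node->title = 'Set inside the function';
}

// Assigns a new object to the local variable: the caller does not see this,
// because only the function's own copy of the handle is changed.
function example_replace_node($node) {
  $node = new stdClass();
  $node->title = 'Replacement object';
}

$node = new stdClass();
$node->title = 'Original';

example_set_title($node);
print $node->title . "\n"; // prints "Set inside the function"

example_replace_node($node);
print $node->title . "\n"; // still prints "Set inside the function"
```

node_object_prepare() is the first case: it only sets properties on the object it is given, so the changes show up in the calling code.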

If someone can point to a concise explanation, that would be great.

OK, thanks for the quick reply!

Do you know if this behavior has been around all along or is it only in recent versions of PHP?

Setting
$node->body[$node->language][0]['format'] = 2;
does not work. Use instead
$node->body[$node->language][0]['format'] = 'filtered_html';

In PHP5, objects are always passed by reference in functions. (In PHP4 this wasn't so.)

Thanks for pointing that out, Anonymous. In one of the alpha releases the method of specifying filter format was changed from integer to machine name. This is a really great change. It makes the code more transportable across Drupal systems.

Nice post, thanks.

Could you explain where it is possible to put these code snippets? What context or bootstrap is needed?

I would like to import a blog in Drupal (all provided solutions are half good and half bad...). I just want to make some sql queries on the blog database to get title/body and then insert it in Drupal with your examples. Can I make a standalone script or should I include my code somewhere in D7?

Regards.

I added these at the top of the file:

define('DRUPAL_ROOT', getcwd());
require_once DRUPAL_ROOT . '/includes/bootstrap.inc';
drupal_bootstrap(DRUPAL_BOOTSTRAP_FULL);

and that worked.

Is there an easy way to assign tags to a node before saving in this process?

BTW, pathauto alleviates the need for assigning the path for those that may be using pathauto.

Jonah Ellison, thanks for that! I didn't realize, and that's very useful information

awesome and simple explanation thanks.

I used the same and it's working. Thanks a lot! But can you tell me how to save a node with a file upload option?

thank you! adding that worked. I wish more drupal tutorials gave you FULL context / examples. I spent hours trying to get this working. Thanks again.

When updating nodes programmatically do you need to update the safe_value as well as value? If so do you need to run it through a sanitize function?
Many thanks.

I appreciate the great D7 php node creation example. Thanks!

Best regards,

Chris

The node is an object and in PHP objects are always passed by reference in function calls.

One of the cleanest tutorials I have ever seen. Thanks a lot man!

This was really helpful.

If only all Drupal articles were this clear. Let's start a revolution!

If you're updating translated nodes whose fields do not have a language, you'll need to use LANGUAGE_NONE to access that content before saving.


$nids = array(10246, 10856, 10857, 16310, 16157);
foreach ($nids as $nid) {
  $node = node_load($nid);
  $node->body[LANGUAGE_NONE][0]['value'] = 'My new content';
  $node->revision = 1;
  $node->log = 'This node was updated via code at ' . date('c');
  node_save($node);
}

Hi,

Your node creation code is very helpful and is working fine. But I have to put this code in my custom module, so it needs to be inside a function. I can't work out how to put this code inside a function, or what other way there is to incorporate it into my module. Please suggest something ASAP.

Thanks
Avaya

Hi,
With this code we can create the node properly. But my problem is that whenever I clear the cache, the node is created again. I want the node to be created only once. Please help me.

Regards

Thanks for taking the time to write this up!

That was a great help

There is a big issue with this command in that it bypasses all Drupal access restrictions in updating/adding/or deleting a node. This could lead to huge vulnerabilities if placed in the wrong place and found by the wrong people.

I have implemented a quick check before updating...

if (node_access('update', $node) ) {

node_save($node);

echo "Node with nid " . $node->nid . " updated!\n";

} else {

echo "Access Denied";

}

It worked great for my needs.

In this example, $node is an object. Objects are passed by reference by default in PHP.

Hello,

I am creating a node programmatically and using Filtered HTML as the format. But when I try to edit that node again, the line spacing and other formatting get messed up in the edit box...

Any clue ?

Thanks,
RajeevK

This saved me from hours MORE frustration at trying to do what I was able to do in 25 minutes after reading this!